[developers] Re: redwoods and mrs

Stephan Oepen oe at csli.Stanford.EDU
Sat May 14 01:11:42 CEST 2005


hi anton,

good to see you actively using some of the LinGO stuff!  i am copying a
larger list on some replies, so others get a chance to comment too.

> is there a simple way to extract (r)mrs's from redwoods?
> I'd like to study how verbs cluster based on what occurs in their ARG
> slots.  Would another good way to go about getting enough data for this be
> parsing sentences from eg GCIDE with PET?

starting from the Redwoods treebanks, you get the benefit of looking only
at hand-selected analyses, i.e. the chance of seeing the correct MRS for
the intended analysis is far higher.  parsing GCIDE data with PET, you
would depend on the stochastic parse selection machinery to choose among
analyses for you, and (lacking training data) we have never trained a
model specifically for GCIDE.  even when trained on one of the Redwoods
domains, parse selection only achieves some eighty per cent accuracy, so
manually annotated data will always be better.

exporting MRSs from Redwoods should be straightforward.  take a look at

  http://wiki.delph-in.net/moin/RedwoodsTop

and then consider adapting the `export' script that is already supplied.
this will likely work best in a Linux environment with plenty of RAM
(more than 1.5 gbytes), though.
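
once MRSs are exported, tabulating what occurs in the ARG slots of each
verb could look roughly like the sketch below.  this is only an
illustration under stated assumptions, not part of the export script: the
input format (one list of dicts per MRS, with `pred' and `args' keys), the
helper names, and the predicate spellings such as `_chase_v_1' are all
hypothetical, and the real export output would need its own reader.  it
also assumes that ARG0 holds the intrinsic variable of a predication and
that quantifier predicates end in `_q'.

```python
from collections import Counter, defaultdict


def is_verb(pred):
    # assumption: verb predicates look like "_lemma_v_sense"
    parts = pred.split("_")
    return pred.startswith("_") and len(parts) > 2 and parts[2] == "v"


def is_quantifier(pred):
    # assumption: quantifier predicates end in "_q"
    return pred.endswith("_q")


def slot_fillers(mrs_list):
    """Map verb predicate -> role (ARG1, ARG2, ...) -> Counter of fillers."""
    table = defaultdict(lambda: defaultdict(Counter))
    for eps in mrs_list:
        # index non-quantifier predications by their intrinsic variable (ARG0)
        by_var = defaultdict(list)
        for ep in eps:
            var = ep["args"].get("ARG0")
            if var is not None and not is_quantifier(ep["pred"]):
                by_var[var].append(ep["pred"])
        # for each verb, look up which predicates introduce its arguments
        for ep in eps:
            if not is_verb(ep["pred"]):
                continue
            for role, var in ep["args"].items():
                if role == "ARG0":
                    continue  # the verb's own event variable
                for filler in by_var.get(var, []):
                    table[ep["pred"]][role][filler] += 1
    return table


# hypothetical hand-written MRS for "the dog chases the cat"
demo = [
    {"pred": "_the_q", "args": {"ARG0": "x3"}},
    {"pred": "_dog_n_1", "args": {"ARG0": "x3"}},
    {"pred": "_chase_v_1", "args": {"ARG0": "e2", "ARG1": "x3", "ARG2": "x5"}},
    {"pred": "_the_q", "args": {"ARG0": "x5"}},
    {"pred": "_cat_n_1", "args": {"ARG0": "x5"}},
]
table = slot_fillers([demo])
```

the resulting per-verb counters could then serve as feature vectors for
whatever clustering method you prefer.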

                                                   all the best  -  oe

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+++ Universitetet i Oslo (ILN); Boks 1102 Blindern; 0317 Oslo; (+47) 2285 7989
+++     CSLI Stanford; Ventura Hall; Stanford, CA 94305; (+1 650) 723 0515
+++       --- oe at csli.stanford.edu; oe at hf.uio.no; stephan at oepen.net ---
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


