[developers] Re: small ERG lexicon
benjamin.waldron at cl.cam.ac.uk
Fri Apr 22 17:34:57 CEST 2005
Ann Copestake wrote:
>One more thing - it would be very useful for me to have a smaller lexicon
>available for the ERG. I suspect this is true for other people too if they
>can't use the db or don't want to. I think the ideal situation would be if a
>lexicon could be dumped that contains all the words that might be accessed when
>processing the CSLI test suite (i.e., include all senses and all MWEs which
>have a match on the RHS). If this would be easy to do automatically, perhaps
>you could add something to the routine which dumps the main lexicon in TDL
>format? See what Dan thinks anyway. It would need to be done in such a way
>that it wasn't any extra work for him. I think you can get the list of words
>used from the fine system.
I've created a function, (dump-small-lexicon), to perform the above task.
The lex ids are taken from *lex-ids-used*. The output file name can be
specified using :file; if omitted, it defaults to something sensible.
erg/lexicon-small.tdl (derived from the CSLI test suite) is now in CVS.
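For readers who can't run the function itself, the idea can be sketched roughly as follows. This is only an illustration in Python, not the actual Lisp code: the dictionary-based lexicon, the helper name dump_small_lexicon, and the toy entries (which are not real ERG entries) are all assumptions made up for the example. The only thing taken from the description above is the behaviour: keep the entries whose ids were recorded as used, and write them out as TDL text.

```python
import io

def dump_small_lexicon(lexicon, lex_ids_used, out):
    """Write only the entries whose ids are in lex_ids_used.

    lexicon      -- dict mapping a lex id to its TDL text (hypothetical layout)
    lex_ids_used -- set of lex ids recorded while processing a test suite
    out          -- any writable text stream
    Returns the number of entries written.
    """
    written = 0
    # Sort the ids so that repeated dumps produce the same file.
    for lex_id in sorted(lexicon):
        if lex_id in lex_ids_used:
            out.write(lexicon[lex_id].rstrip() + "\n\n")
            written += 1
    return written

# Toy data: placeholder entries, not taken from the ERG.
full_lexicon = {
    "dog_n1": "dog_n1 := toy_noun_le.",
    "cat_n1": "cat_n1 := toy_noun_le.",
    "bark_v1": "bark_v1 := toy_verb_le.",
}

buf = io.StringIO()
count = dump_small_lexicon(full_lexicon, {"dog_n1", "bark_v1"}, buf)
print(count)
print(buf.getvalue())
```

Ids in the "used" set that have no entry in the lexicon are silently skipped here; whether the real function does the same is not something this sketch can say.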
Stephan, would it be possible to hook functions like the above onto
tsdb::finalize-run? That way they could be called automatically, and only
when it is meaningful to do so.