[developers] New ERG with improved tokenization/preprocessing for PET
tim at csse.unimelb.edu.au
Tue May 26 01:26:42 CEST 2009
<SNIP> (that's a big snip!)
> > i think an actual solution would require a tool like morpha (which is
> > part of pre-processing in RASP, i believe), adapted for PTB tags and
> > american english. one could argue this /should/ be part of our input
> > pre-processing prior to parsing, but that is not an option right now.
> Why is it not an option? I thought Tim has something like this
> already. If not we could make one, or maybe see if the RASP project
> has one squirreled away somewhere. If we can't find anything better,
> I volunteer to make one: under the assumption that irregular cases
> should go in the ERG proper, I believe it would be reasonably cheap to
Tim does indeed have such a thing if needed. Or were you referring to more
technical reasons for it not being an option, Stephan?
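
To make the idea in the quoted text concrete, here is a rough sketch of a regular-inflection stripper keyed on PTB tags. It is not morpha and not the tool mentioned above; the rule tables and the name `stem` are hypothetical and made up for illustration. As suggested in the thread, irregular forms are assumed to be handled in the ERG proper, so they are not covered here. E-restoration ("making" -> "make") is also left out, since it needs a lexicon or a longer rule list.

```python
import re

# Hypothetical rule tables: (pattern, replacement) pairs tried in order.
# These are illustrative assumptions, not morpha's actual rule set.
_S_RULES = [
    (re.compile(r"([^aeiou])ies$"), r"\1y"),    # parties -> party
    (re.compile(r"(s|x|z|ch|sh)es$"), r"\1"),   # boxes -> box
    (re.compile(r"([^s])s$"), r"\1"),           # dogs -> dog
]
_ED_RULES = [
    (re.compile(r"([^aeiou])ied$"), r"\1y"),    # tried -> try
    (re.compile(r"([bdgmnprt])\1ed$"), r"\1"),  # stopped -> stop
    (re.compile(r"ed$"), ""),                   # walked -> walk
]
_ING_RULES = [
    (re.compile(r"([bdgmnprt])\1ing$"), r"\1"), # running -> run
    (re.compile(r"ing$"), ""),                  # walking -> walk
]

# Only inflected PTB tags get rules; any other tag leaves the token alone.
RULES_BY_TAG = {
    "NNS": _S_RULES,
    "VBZ": _S_RULES,
    "VBD": _ED_RULES,
    "VBN": _ED_RULES,
    "VBG": _ING_RULES,
}


def stem(token, tag):
    """Return a lower-cased guess at the stem of `token` given its PTB tag.

    Irregular forms are deliberately not handled; the first matching
    rule for the tag wins, and unknown tags return the token unchanged.
    """
    word = token.lower()
    for pattern, repl in RULES_BY_TAG.get(tag, []):
        new, count = pattern.subn(repl, word)
        if count:
            return new
    return word
```

Keying the rules on the tag keeps a singular noun like "glass" (NN) from being stripped, which is the main reason to run this after tagging and not before.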