[pet] problem with generic entries, suffixes and punctuation
Rebecca Dridan
bec.dridan at gmail.com
Mon Oct 15 17:19:33 CEST 2007
Hi all,
I found a problem this week with the add_generics() code in cheap and the
fsr tokenisation method. I'm not sure why I haven't noticed it before,
except that the trigger is not that frequent.
Using -tok=fsr, the last token of a sentence I'm parsing is "retirees.",
with the period included. When I use those same tokens but add POS tags
to get the default les, a $generic_pl_noun item is not created, because
the suffix check fails: the last character is "." rather than "s". I'm not
sure where the fix belongs, in the suffix checking code or elsewhere,
but perhaps someone could take a look at it?
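
To make the failure concrete, here is a minimal Python sketch of the behaviour described above. It is not PET's actual code (cheap is written in C++); the function names and the choice of which characters count as trailing punctuation are hypothetical and only illustrate one possible place for a fix, namely stripping punctuation before the suffix test.

```python
import string

# Hypothetical illustration only: none of these names come from PET.
# Assumption: any ASCII punctuation at the end of a token should be
# ignored when testing for a generic-entry suffix.
TRAILING_PUNCT = string.punctuation


def naive_suffix_match(form: str, suffix: str) -> bool:
    """Plain suffix test on the raw token, as in the failing case."""
    return form.endswith(suffix)


def suffix_match_ignoring_punct(form: str, suffix: str) -> bool:
    """Strip trailing punctuation first, then test the suffix."""
    stripped = form.rstrip(TRAILING_PUNCT)
    return stripped.endswith(suffix)


if __name__ == "__main__":
    token = "retirees."  # last token produced with -tok=fsr
    print(naive_suffix_match(token, "s"))
    print(suffix_match_ignoring_punct(token, "s"))
```

Whether the stripping belongs in the suffix check or in the tokeniser is exactly the open question; the sketch only shows that the plain test rejects "retirees." while the stripped one accepts it.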
Thanks,
Rebecca