[pet] problem with generic entries, suffixes and punctuation
Ben Waldron
ben.waldron at hf.ntnu.no
Mon Oct 15 18:52:16 CEST 2007
I think you will want to disable the suffix checking (the
generic-le-suffixes setting) in the PET config file, common.set.
See the diff below.
- Ben
bmw20@t41:~$ diff erg/pet/common.set sciborg/tools/parsing/erg/pet/common.set
109,130c109
< ;;;
< ;;; some generic lexical entries require inflectional marking. this mechanism
< ;;; is a filter on which generic entries proposed by other means can survive:
< ;;; generic entries listed here will only be postulated if the required suffix
< ;;; can be matched against the input token.
< ;;;
< ;;; when using only generic entries licensed by a POS tag, the suffix filter
< ;;; really does not make a lot of sense anymore. (6-jun-03; oe)
< ;;;
< generic-le-suffixes :=
< $generic_trans_verb_pres3sg "S"
< $generic_trans_verb_past "ED"
< $generic_trans_verb_psp "ED"
< $generic_trans_verb_prp "ING"
< $generic_pl_noun "S"
< ;;
< ;; when running without a POS tagger, effectively disable a few generics
< ;;
< $generic_adj_compar "_block_"
< $generic_adj_superl "_block_"
< .
<
---
> ;; [sciborg] don't want generic-le-suffixes
131a111
Rebecca Dridan wrote:
> Hi all,
>
> I found a problem this week with the cheap add_generics() code and the
> fsr tokenisation method. Not sure why I haven't noticed before, except
> that the trigger is not that frequent...
>
> Using -tok=fsr, the last token of a sentence I'm parsing is
> "retirees.", with the period included. When I use those same tokens,
> but add POS tags to get the default les, a $generic_pl_noun item is
> not created, because the suffix check fails - the last character is
> "." rather than "s". Not sure where the fix belongs, in the suffix
> checking code or elsewhere, but perhaps someone can take a look at it?
>
> Thanks,
>
> Rebecca
>
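For illustration only, here is a minimal Python sketch of the kind of
suffix filter described above. It is not the actual cheap code: the
function name suffix_ok, the strip_punct option, and the
case-insensitive comparison are all assumptions made for the example.
The table simply mirrors the generic-le-suffixes entries in the diff.
It shows why "retirees." fails the "S" check, and that stripping
trailing punctuation before the check would be one possible fix on the
suffix-checking side.

```python
import string

# Mirrors the generic-le-suffixes table from common.set (see the diff).
GENERIC_LE_SUFFIXES = {
    "$generic_trans_verb_pres3sg": "S",
    "$generic_trans_verb_past": "ED",
    "$generic_trans_verb_psp": "ED",
    "$generic_trans_verb_prp": "ING",
    "$generic_pl_noun": "S",
    # "_block_" never matches a real token, so these are effectively disabled.
    "$generic_adj_compar": "_block_",
    "$generic_adj_superl": "_block_",
}


def suffix_ok(generic_le, token, strip_punct=False):
    """Hypothetical filter: True if `token` may license `generic_le`."""
    suffix = GENERIC_LE_SUFFIXES.get(generic_le)
    if suffix is None:
        # Entry not listed in the table: no suffix requirement applies.
        return True
    # Optionally drop trailing punctuation such as the sentence-final period.
    form = token.rstrip(string.punctuation) if strip_punct else token
    # Case-insensitive comparison is an assumption of this sketch.
    return form.upper().endswith(suffix.upper())


# The token from the report: the final "." hides the plural "s".
print(suffix_ok("$generic_pl_noun", "retirees."))                    # False
print(suffix_ok("$generic_pl_noun", "retirees.", strip_punct=True))  # True
```

Removing the generic-le-suffixes setting, as in the diff, corresponds
to the "entry not listed" branch: every generic entry proposed by the
POS tags survives, whatever the token ends in.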
--
More and more of our imports come from abroad.