[pet] problem with generic entries, suffixes and punctuation

Ben Waldron ben.waldron at hf.ntnu.no
Mon Oct 15 18:52:16 CEST 2007


I think you will want to disable the suffix check (the generic-le-suffixes
setting) in the PET config file, common.set. See the diff below.

- Ben

bmw20 at t41:~$ diff erg/pet/common.set sciborg/tools/parsing/erg/pet/common.set
109,130c109
< ;;;
< ;;; some generic lexical entries require inflectional marking.  this mechanism
< ;;; is a filter on which generic entries proposed by other means can survive:
< ;;; generic entries listed here will only be postulated if the required suffix
< ;;; can be matched against the input token.
< ;;;
< ;;; when using only generic entries licensed by a POS tag, the suffix filter
< ;;; really does not make a lot of sense anymore.               (6-jun-03; oe)
< ;;;
< generic-le-suffixes :=
<   $generic_trans_verb_pres3sg "S"
<   $generic_trans_verb_past "ED"
<   $generic_trans_verb_psp "ED"
<   $generic_trans_verb_prp "ING"
<   $generic_pl_noun "S"
<   ;;
<   ;; when running without a POS tagger, effectively disable a few generics
<   ;;
<   $generic_adj_compar "_block_"
<   $generic_adj_superl "_block_"
< .
<
---
> ;; [sciborg] don't want generic-le-suffixes
131a111
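
To make the failure in the quoted report concrete, here is a rough Python
sketch of what a suffix filter of this kind does. It is not the cheap code:
the function name passes_suffix_filter, the strip_punct option and the
case-insensitive comparison are assumptions made for illustration. Only the
table of generic entries and suffixes is taken from the setting in the diff.

```python
import string

# Suffix requirements copied from the generic-le-suffixes setting in the diff.
GENERIC_LE_SUFFIXES = {
    "$generic_trans_verb_pres3sg": "S",
    "$generic_trans_verb_past": "ED",
    "$generic_trans_verb_psp": "ED",
    "$generic_trans_verb_prp": "ING",
    "$generic_pl_noun": "S",
    "$generic_adj_compar": "_block_",
    "$generic_adj_superl": "_block_",
}


def passes_suffix_filter(generic_le, token,
                         suffixes=GENERIC_LE_SUFFIXES, strip_punct=False):
    """Return True if generic_le may be postulated for token.

    Hypothetical helper, not part of PET. Entries without a listed
    suffix are never filtered. Comparison is case-insensitive, which
    is an assumption about how the real check behaves.
    """
    required = suffixes.get(generic_le)
    if required is None:
        return True
    # Optional workaround (an assumption, not an existing PET option):
    # drop trailing punctuation before looking at the suffix.
    form = token.rstrip(string.punctuation) if strip_punct else token
    return form.upper().endswith(required.upper())


if __name__ == "__main__":
    # Token without the period: the plural noun generic survives.
    print(passes_suffix_filter("$generic_pl_noun", "retirees"))
    # Token as produced with the period attached: the check fails.
    print(passes_suffix_filter("$generic_pl_noun", "retirees."))
    # Stripping punctuation first lets the entry through again.
    print(passes_suffix_filter("$generic_pl_noun", "retirees.",
                               strip_punct=True))
    # An empty table corresponds to removing the setting entirely.
    print(passes_suffix_filter("$generic_pl_noun", "retirees.", suffixes={}))
```

Passing an empty table corresponds to the change in the diff: with no
generic-le-suffixes listed, nothing is filtered, so the POS tag alone decides
which generic entries are proposed.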


Rebecca Dridan wrote:
> Hi all,
>
> I found a problem this week with the cheap add_generics() code and the 
> fsr tokenisation method. Not sure why I haven't noticed before, except 
> that the trigger is not that frequent...
>
> Using -tok=fsr, the last token of a sentence I'm parsing is 
> "retirees.", with the period included. When I use those same tokens, 
> but add POS tags to get the default les, a $generic_pl_noun item is 
> not created, because the suffix check fails - the last character is 
> "." rather than "s". Not sure where the fix belongs, in the suffix 
> checking code or elsewhere, but perhaps someone can take a look at it?
>
> Thanks,
>
> Rebecca
>


-- 
More and more of our imports come from abroad.



