[pet] handling of unknown lexical items

Rebecca Dridan bec.dridan at gmail.com
Wed Oct 26 21:12:09 CEST 2011


Are you using a POS tagger to annotate the input to PET? I assume the
online demo is using the TnT tagger, which is the default.  How you feed
the POS tags into the parser depends on which versions of the parser and
the grammar you are using, but you'll definitely want POS-tagged input
to get decent unknown-word handling.

Rebecca

On 26/10/11 8:48 PM, John Stewart wrote:
> Hello,
>
> I am trying to reproduce the behaviour of the online demo using the
> command-line PET + ERG system, but having various troubles.  One is
> with unknown words.  For the sentence
>
> (1)  ugo kissed pilar
>
> The online demo returns _ugo/nn_u_unknown and _pilar/nn_u_unknown,
> which is correct.
>
> Using the command-line tool as follows:
>
>> cheap -default-les=all -verbose=3 -mrs english.grm
> I get the surprising output:
>
> (1011 np_frg_c 0 0 3 [root_frag]
>    (1007 hdn_bnp_c 0 0 3
>      (1003 n-hdn_cpd_c 0 0 3
>        (5 gen_generic_noun/n_-_mc-ns-g_le 0 0 1 []
>          (1 "ugo" 0 0 1<0:1>))
>        (1000 hdn-n_prnth_c 0 1 3
>          (610 generic_pl_noun/n_-_c-pl-unk_le 0 1 2 []
>            (2 "kissed" 0 1 2<1:2>))
>          (865 generic_pl_noun_ne/n_-_c-pl-gen_le 0 2 3 []
>            (3 "pilar" 0 2 3<2:3>))))))
>
> So an NP fragment.  Incidentally I'm unsure how to read the leaf
> types, as the format, with "/", does not seem to match the templates
> documented at <http://moin.delph-in.net/ErgLeTypes>.  But in any case,
> plural nouns are an incorrect default (I get worse results with
> -default-les=traditional).  Are there cheap switches that will yield
> the better output given by the online demo?
>
> Thanks for any suggestions.
>
> jds
>
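
As a rough aid for reading derivations like the one quoted above, the
following Python sketch pulls out each token together with the parts
before and after the "/" in its leaf label.  It assumes, without
confirmation from this thread, that the label has the form
lexical-entry/lexical-type and that entries used for unknown words carry
"generic" in their names, as they do in the output shown.  The regular
expression and the names LEAF, lexical_leaves and generic_tokens are
illustrative only and are not part of PET.

```python
import re

# The derivation from the quoted message, with the "> " quote prefixes removed.
DERIVATION = '''
(1011 np_frg_c 0 0 3 [root_frag]
  (1007 hdn_bnp_c 0 0 3
    (1003 n-hdn_cpd_c 0 0 3
      (5 gen_generic_noun/n_-_mc-ns-g_le 0 0 1 []
        (1 "ugo" 0 0 1<0:1>))
      (1000 hdn-n_prnth_c 0 1 3
        (610 generic_pl_noun/n_-_c-pl-unk_le 0 1 2 []
          (2 "kissed" 0 1 2<1:2>))
        (865 generic_pl_noun_ne/n_-_c-pl-gen_le 0 2 3 []
          (3 "pilar" 0 2 3<2:3>))))))
'''

# Assumed shape of a preterminal node followed by its token:
#   (id entry/type score start end [...] (id "form" ...
LEAF = re.compile(
    r'\(\d+ (?P<entry>[^\s/]+)/(?P<letype>\S+) \S+ \d+ \d+ \[[^\]]*\]'
    r'\s*\(\d+ "(?P<form>[^"]*)"'
)


def lexical_leaves(derivation):
    """Return (form, entry, type) triples for every "/"-labelled leaf."""
    return [
        (m.group("form"), m.group("entry"), m.group("letype"))
        for m in LEAF.finditer(derivation)
    ]


def generic_tokens(derivation):
    """Return the forms whose entry name contains "generic" (assumed convention)."""
    return [form for form, entry, _ in lexical_leaves(derivation) if "generic" in entry]


if __name__ == "__main__":
    for form, entry, letype in lexical_leaves(DERIVATION):
        print(f"{form:8} entry={entry:22} type={letype}")
    print("generic:", generic_tokens(DERIVATION))
```

Run on the derivation above, this lists all three tokens, including
"kissed", as being covered by generic noun entries, which matches the NP
fragment analysis described in the message.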
