<div dir="ltr"><div><div>oe<br><br></div>Ah, after your last email I am now calling PET as you described. But when I originally tried this (two months ago) I called it without the -sm -repp -cm options, and I did not think to try again with your new parameters. Strangely, I was just looking at the output after making "male" a mass noun (which does now parse) and I saw "generic_card_ne" in the parse, but didn't think to question it. See, it was under my eyes all the time. Sorry.<br> <br></div><div>Your pointer to the tmr files is illuminating, and should allow me to remove loads of special entries from the lexicon.<br><br>Thank you<br></div><div><br></div><div>David<br></div></div><div class="gmail_extra"> <br><br><div class="gmail_quote">On 17 September 2013 10:53, Stephan Oepen <span dir="ltr">&lt;<a href="mailto:oe@ifi.uio.no" target="_blank">oe@ifi.uio.no</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> hi david,<br> <br> how exactly do you invoke PET nowadays?<br> <br> $ echo "a person (1234543265)" | cheap -sm -repp -cm -default-les=all<br> -verbose=3 english.grm<br> <br> [...]<br> <br> derivation[1] (0):a person (1234543265)<br> <br> (117 np_frg_c 0 0 3 [root_inffrag]<br> (116 sp-hd_n_c 0 0 3<br> (41 a_det/d_-_sg-nmd_le 0 0 1 []<br> (31 "a" 0 0 1 &lt;0:1&gt;))<br> (114 hdn-n_prnth_c 0 1 3<br> (98 n_ms-cnt_ilr 0 1 2<br> (81 person_n1/n_-_mc_le 0 1 2 []<br> (30 "person" 0 1 2 &lt;2:8&gt;)))<br> (105 num_prt-det-nc_c 0 2 3<br> (95 w_lparen_plr 0 2 3<br> (93 w_rparen_plr 0 2 3 [w_lparen_plr]<br> (88 generic_card_ne/aj_-_i-crd-gen_le 0 2 3 [w_rparen_plr<br> w_lparen_plr]<br> (36 "(1234543265)" 0 2 3 &lt;9:21&gt;))))))))<br> <br> there are in total eight analyses, all with the number<br> as a cardinal adjective. 
arguably, we should maybe<br> also allow an analysis as an identifier NE, but that<br> would cost substantially in ambiguity ...<br> <br> you can look at tmr/ne{1,2,3}.tdl in the ERG source<br> for the patterns we recognize through what we call<br> lightweight named entity recognition.<br> <br> best, oe<br> <div class="HOEnZb"><div class="h5"><br> On Tue, Sep 17, 2013 at 9:35 AM, &lt;<a href="mailto:mottdh@googlemail.com">mottdh@googlemail.com</a>&gt; wrote:<br> > Hi<br> ><br> > I am trying to parse a sentence that has the fragment<br> ><br> > a person (1234543265)<br> ><br> > I think the ERG treats the parenthetical item as an adjective, which is fine, but the number itself is not recognised, and the parse fails. (It works if I manually add the number to the lexicon!)<br> ><br> > My question is: how do I get numbers (and dates, by the way) to be recognised as a particular lexical type, as if they were in the lexicon? I am guessing that this is a preparsing issue, and would welcome some pointers to where this might be described.<br> ><br> > Thank you<br> ><br> > David Mott<br> ><br> > Sent from my iPad<br> <br> <br> <br> </div></div><span class="HOEnZb"><font color="#888888">--<br> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<br> +++ Universitetet i Oslo (IFI); Boks 1080 Blindern; 0316 Oslo; <a href="tel:%28%2B47%29%202284%200125" value="+4722840125">(+47) 2284 0125</a><br> +++ --- <a href="mailto:oe@ifi.uio.no">oe@ifi.uio.no</a>; <a href="mailto:stephan@oepen.net">stephan@oepen.net</a>; <a href="http://www.emmtee.net/oe/" target="_blank">http://www.emmtee.net/oe/</a> ---<br> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<br> </font></span></blockquote></div><br></div>