<div dir="ltr"><div><div>oe<br><br></div>Ah, after your last email I am now calling PET as you said above. But when I originally tried this out (two months ago) I called it without the -sm -repp -cm options, and I did not think to try it again with your new parameters. Strangely, I was just looking at the output after making "male" a mass noun (which does now parse) and I saw "generic_card_ne" in the parse, but didn't think to question it. See, it was under my eyes all the time. Sorry.<br>
<br></div><div>Your pointer to the tmr files is illuminating, and should allow me to remove loads of special entries in the lexicon.<br><br>thank you<br></div><div><br></div><div>David<br></div></div><div class="gmail_extra">
<br><br><div class="gmail_quote">On 17 September 2013 10:53, Stephan Oepen <span dir="ltr"><<a href="mailto:oe@ifi.uio.no" target="_blank">oe@ifi.uio.no</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
hi david,<br>
<br>
how exactly do you invoke PET nowadays?<br>
<br>
$ echo "a person (1234543265)" | cheap -sm -repp -cm -default-les=all -verbose=3 english.grm<br>
<br>
[...]<br>
<br>
derivation[1] (0):a person (1234543265)<br>
<br>
(117 np_frg_c 0 0 3 [root_inffrag]<br>
(116 sp-hd_n_c 0 0 3<br>
(41 a_det/d_-_sg-nmd_le 0 0 1 []<br>
(31 "a" 0 0 1 <0:1>))<br>
(114 hdn-n_prnth_c 0 1 3<br>
(98 n_ms-cnt_ilr 0 1 2<br>
(81 person_n1/n_-_mc_le 0 1 2 []<br>
(30 "person" 0 1 2 <2:8>)))<br>
(105 num_prt-det-nc_c 0 2 3<br>
(95 w_lparen_plr 0 2 3<br>
(93 w_rparen_plr 0 2 3 [w_lparen_plr]<br>
(88 generic_card_ne/aj_-_i-crd-gen_le 0 2 3 [w_rparen_plr w_lparen_plr]<br>
(36 "(1234543265)" 0 2 3 <9:21>))))))))<br>
<br>
there are in total eight analyses, all with the number<br>
as a cardinal adjective. arguably, we should maybe<br>
also allow an analysis as an identifier NE, but that<br>
would cost substantially in ambiguity ...<br>
<br>
you can look at tmr/ne{1,2,3}.tdl in the ERG source<br>
for the patterns we recognize through what we call<br>
lightweight named entity recognition.<br>
<br>
best, oe<br>
<div class="HOEnZb"><div class="h5"><br>
On Tue, Sep 17, 2013 at 9:35 AM, <<a href="mailto:mottdh@googlemail.com">mottdh@googlemail.com</a>> wrote:<br>
> Hi<br>
><br>
> I am trying to parse a sentence that has the fragment<br>
><br>
> a person (1234543265)<br>
><br>
> I think the ERG treats the parenthetical item as an adjective, which is fine. But the number itself is not recognised, and the parse fails. (It works if I manually add the number to the lexicon!)<br>
><br>
> My question is: how do I get numbers (and dates, by the way) to be recognised as a particular lexical type, as if they were in the lexicon? I am guessing that this is a pre-parsing issue, and would welcome some pointers to where this might be described.<br>
><br>
> thank you<br>
><br>
> David Mott<br>
><br>
> Sent from my iPad<br>
<br>
<br>
<br>
</div></div><span class="HOEnZb"><font color="#888888">--<br>
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<br>
+++ Universitetet i Oslo (IFI); Boks 1080 Blindern; 0316 Oslo; <a href="tel:%28%2B47%29%202284%200125" value="+4722840125">(+47) 2284 0125</a><br>
+++ --- <a href="mailto:oe@ifi.uio.no">oe@ifi.uio.no</a>; <a href="mailto:stephan@oepen.net">stephan@oepen.net</a>; <a href="http://www.emmtee.net/oe/" target="_blank">http://www.emmtee.net/oe/</a> ---<br>
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<br>
</font></span></blockquote></div><br></div>