[erg] a question of parenthesised numbers

David Mott mottdh at googlemail.com
Tue Sep 17 12:04:14 CEST 2013


oe

Ah, after your last email I am now calling PET as you said above. But when
I tried this out originally (two months ago) I called it without the -sm
-repp -cm options, and I did not think to try it again with your new
parameters. Strangely, I was just looking at the output after making "male"
a mass noun (which does now parse) and I saw "generic_card_ne" in the parse,
but didn't think to question that. See, it was under my eyes all the time.
Sorry.
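
For anyone scripting this, here is a minimal sketch of the same invocation
wrapped in Python. It assumes a "cheap" binary on the PATH and a compiled
grammar named english.grm, as in the command quoted in this thread. The
helper names are hypothetical; only the flags are taken from that command.

```python
import shutil
import subprocess

# Flags copied from the command quoted in this thread; nothing else is
# assumed about what cheap accepts.
CHEAP_FLAGS = ["-sm", "-repp", "-cm", "-default-les=all", "-verbose=3"]


def build_cheap_command(grammar="english.grm", binary="cheap"):
    """Return the argument list for one cheap call (hypothetical helper)."""
    return [binary, *CHEAP_FLAGS, grammar]


def parse_with_cheap(sentence, grammar="english.grm"):
    """Pipe one sentence to cheap on stdin and return its raw output.

    Returns None when no cheap binary is found on the PATH.
    """
    if shutil.which("cheap") is None:
        return None
    result = subprocess.run(
        build_cheap_command(grammar),
        input=sentence + "\n",
        capture_output=True,
        text=True,
        check=False,
    )
    # Which stream carries the derivations is not assumed here, so both
    # are returned.
    return result.stdout + result.stderr
```

Calling parse_with_cheap("a person (1234543265)") should then correspond to
the echo pipeline in the quoted message, provided the grammar file is in the
working directory.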

Your pointer to the tmr files is illuminating, and should allow me to
remove loads of special entries in the lexicon.
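
As a rough illustration of why pattern-based recognition makes such entries
unnecessary, the toy sketch below maps any parenthesised number to one
generic label instead of listing each number in the lexicon. It is not the
ERG's own machinery and does not reproduce the rules in tmr/ne{1,2,3}.tdl:
the regular expression, the function name, and the decision to strip
punctuation before matching are all assumptions made for the example. Only
the label generic_card_ne comes from the derivation quoted in this thread.

```python
import re

# Assumed pattern: an unsigned integer, optionally with thousands separators.
CARDINAL = re.compile(r"^\d+(?:,\d{3})*$")


def classify_token(token):
    """Return a generic lexical label for a token, or None (toy helper)."""
    # Strip surrounding punctuation such as parentheses before matching.
    # This is only a loose analogue of the separate punctuation rules
    # (w_lparen_plr, w_rparen_plr) visible in the quoted derivation.
    core = token.strip("()[]\"'.,;:")
    if CARDINAL.match(core):
        return "generic_card_ne"
    return None
```

With this sketch, classify_token("(1234543265)") returns "generic_card_ne"
and classify_token("person") returns None, so only the numeric token would
be given a generic entry.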

thank you

David


On 17 September 2013 10:53, Stephan Oepen <oe at ifi.uio.no> wrote:

> hi david,
>
> how exactly do you invoke PET nowadays?
>
> $ echo "a person (1234543265)" | cheap -sm -repp -cm -default-les=all -verbose=3 english.grm
>
> [...]
>
> derivation[1] (0):a person (1234543265)
>
> (117 np_frg_c 0 0 3 [root_inffrag]
>   (116 sp-hd_n_c 0 0 3
>     (41 a_det/d_-_sg-nmd_le 0 0 1 []
>       (31 "a" 0 0 1 <0:1>))
>     (114 hdn-n_prnth_c 0 1 3
>       (98 n_ms-cnt_ilr 0 1 2
>         (81 person_n1/n_-_mc_le 0 1 2 []
>           (30 "person" 0 1 2 <2:8>)))
>       (105 num_prt-det-nc_c 0 2 3
>         (95 w_lparen_plr 0 2 3
>           (93 w_rparen_plr 0 2 3 [w_lparen_plr]
>             (88 generic_card_ne/aj_-_i-crd-gen_le 0 2 3 [w_rparen_plr w_lparen_plr]
>               (36 "(1234543265)" 0 2 3 <9:21>))))))))
>
> there are in total eight analyses, all with the number
> as a cardinal adjective.  arguably, we should maybe
> also allow an analysis as an identifier NE, but that
> would cost substantially in ambiguity ...
>
> you can look at tmr/ne{1,2,3}.tdl in the ERG source
> for the patterns we recognize through what we call
> lightweight named entity recognition.
>
> best, oe
>
> On Tue, Sep 17, 2013 at 9:35 AM,  <mottdh at googlemail.com> wrote:
> > Hi
> >
> > I am trying to parse a sentence that has the fragment
> >
> >   a person (1234543265)
> >
> > I think the ERG treats the parenthetical item as an adjective, which is
> > fine, but the number itself is not recognised and the parse fails. (It
> > works if I manually add the number to the lexicon!)
> >
> > My question is: how do I get numbers (and dates, by the way) to be
> > recognised as a particular lexical type, as if they were in the lexicon?
> > I am guessing that this is a preparsing issue, and would welcome some
> > pointers to where this might be described.
> >
> > thank you
> >
> > David Mott
> >
>
>
>
> --
>
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> +++ Universitetet i Oslo (IFI); Boks 1080 Blindern; 0316 Oslo; (+47) 2284 0125
> +++    --- oe at ifi.uio.no; stephan at oepen.net; http://www.emmtee.net/oe/ ---
>
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>