[developers] Handling unknown lexical entries with ACE parsing

Woodley Packard sweaglesw at sweaglesw.org
Fri Jan 13 00:55:44 CET 2017


Hi Olga and Francis,

If what you want to do is define certain generic lexemes that apply to all words, you do not need a POS tagger at all.  You will need to enable token mapping, but you do not necessarily need to write any token mapping rules.  Next, you need to create some lexical entries whose TDL status is "generic-lex-entry".  Each one of these gets instantiated on every token, so some caution is wise.

The token feature structure gets unified into the lexeme it licenses at a grammar-defined path, so the explosion can be controlled by placing constraints at that path.  When POS tagging is used, the POS info lives on the token feature structures, and typical unknown-word-aware DELPH-IN grammars constrain the POS values on each generic lexical entry’s token to select which generic lexical entry applies in which POS situations.

You can also use the so-called "lexical filtering" stage to throw out some generic (or native, if you want) lexical entries.  See the ERG’s lfr.tdl for examples (Dan uses this to discard generic lexemes proposed by the tagger in situations where the grammar has native lexical coverage).
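
As a rough illustration of the mechanism described above, here is a toy Python model.  It is not ACE or TDL code: the names Token, GenericEntry, compatible, propose and filter_generics are all hypothetical, the POS tags and lexicon are made up, and "unification" is reduced to a simple equality check on a single POS value.  It only sketches the idea that every generic entry is tried on every token, that constraints on the token cut this down, and that a later filtering step can drop generics where native entries exist.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Token:
    form: str
    pos: Optional[str] = None  # None models "no POS tagger was run"


@dataclass(frozen=True)
class GenericEntry:
    name: str
    required_pos: Optional[str] = None  # None = no constraint on the token


def compatible(entry, token):
    """Toy stand-in for unifying the token into the generic lexeme.

    An unspecified value on either side is compatible with anything,
    loosely mirroring how an underspecified feature unifies.
    """
    if entry.required_pos is None or token.pos is None:
        return True
    return entry.required_pos == token.pos


def propose(tokens, native_lexicon, generics):
    """Try every generic entry on every token; keep the compatible ones."""
    chart = []
    for tok in tokens:
        natives = [("native", n) for n in native_lexicon.get(tok.form, [])]
        gens = [("generic", g.name) for g in generics if compatible(g, tok)]
        chart.append((tok.form, natives + gens))
    return chart


def filter_generics(chart):
    """Toy lexical filtering: drop generics wherever a native entry exists."""
    filtered = []
    for form, entries in chart:
        if any(kind == "native" for kind, _ in entries):
            entries = [e for e in entries if e[0] == "native"]
        filtered.append((form, entries))
    return filtered


# Hypothetical data: one known word, one unknown word.
tokens = [Token("dogs", "NNS"), Token("blorked", "VBD")]
native_lexicon = {"dogs": ["dog_n1"]}
generics = [GenericEntry("generic_noun", "NNS"),
            GenericEntry("generic_verb", "VBD")]

for form, entries in filter_generics(propose(tokens, native_lexicon, generics)):
    print(form, entries)
```

With no POS values on the tokens (or no POS constraints on the generic entries), every generic entry in this sketch is proposed for every token, which is the "explosion" that constraints at the token path are meant to control.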

If you find that you need help getting some portions of this airborne, let me know.

Regards,
Woodley

> On Jan 10, 2017, at 9:28 PM, Francis Bond <bond at ieee.org> wrote:
> 
> There is some discussion here (for PET):
> http://moin.delph-in.net/PetInput
> 
> The chart mapping approach also works for ACE.
> 
> You have to use an external POS tagger (which you could fake if all unknown words get the same POS).
> 
> We have this working for Zhong and Jacy (and I think INDRA).  I would be happy to walk you through it if you promise to enhance the documentation :-).
> 
> 
> On Tue, Jan 10, 2017 at 4:57 PM, Olga Zamaraeva <olzama at uw.edu> wrote:
> Dear Developers,
> 
> I would like to parse some text (with ACE) using a small grammar, and I am likely to encounter stems that I do not have in the lexicon. My understanding is that it is possible to add a generic lexical entry for, e.g., "verb", and analyze some of the unknown words morphologically this way. I am looking for any documentation or advice on how this is done. Would anyone be able to point me to anything?
> 
> 
> Thank you,
> Olga
> 
> 
> 
> -- 
> Francis Bond <http://www3.ntu.edu.sg/home/fcbond/>
> Division of Linguistics and Multilingual Studies
> Nanyang Technological University
