[developers] cfrom/cto in MRS

Stephan Oepen oe at ifi.uio.no
Sun Apr 1 22:17:05 CEST 2012


hi paul,

as bec points out, unknown word handling in the ERG is tightly linked to (a) PoS tags on input tokens, to activate compatible generic lexical entries and (b) chart mapping, to apply light-weight NE recognition, e.g. numbers and such.

in your example, i assume there were no PoS tags, nor do any of the NE heuristics fire (sentence-initial capitals do /not/ trigger unknown names); hence, i would indeed expect the behavior you report.

that your unknown tokens trigger unknown words when chart mapping is disabled (which really no longer is a supported configuration for the ERG, i.e. something i would strongly advise against) probably is owed to your FSC inputs leaving the PoS list in the token FSs underspecified. chart mapping will ‘tighten’ such underspecification to an empty list, which will then block activation of PoS-based generics.
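
to make the last point concrete, here is a toy model in plain python. it is not PET or ERG code, and the function name, the use of None for an underspecified list, and the tags "NN" and "JJ" are assumptions made purely for illustration; it only mirrors the logic described above, where an underspecified PoS list is compatible with any PoS-based generic and an empty list is compatible with none.

```python
from typing import Optional, Sequence


def generic_can_fire(token_pos: Optional[Sequence[str]], required_tag: str) -> bool:
    """Toy compatibility check between a token and a PoS-based generic entry.

    token_pos is None -> PoS list left underspecified (compatible with anything)
    token_pos == []   -> PoS list 'tightened' to empty (compatible with nothing)
    otherwise         -> compatible only if the required tag is on the list
    """
    if token_pos is None:
        return True
    return required_tag in token_pos


# hypothetical tokens: without chart mapping, with chart mapping but no tags,
# and with a tagger-supplied tag
for label, pos in [("underspecified", None), ("tightened", []), ("tagged", ["NN"])]:
    print(label, generic_can_fire(pos, "NN"))
```

in this sketch the underspecified token is compatible with every generic, which is the ‘explosion’ one would see without chart mapping, while the tightened token is compatible with none.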

if you really wanted all generics for all unknown words (which i would not expect to scale beyond relatively short sentences), i concur with bec: you would have to adapt the chart mapping rules in (minimally) ‘tmr/tnt.tdl’—or maybe change the generics to have an empty PoS list?
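
while experimenting with either configuration, a quick check for EPs that lack character positions might look like the sketch below. it assumes the XML serialization has the shape of the fragments quoted further down; the sample string is a made-up, closed-off version of those fragments, the function name is hypothetical, and only plain <pred> elements are looked at.

```python
import xml.etree.ElementTree as ET

# made-up sample, modelled on the truncated fragments quoted in this thread
SAMPLE = """<mrs>
<label vid='1'/><var vid='2'/>
<ep cfrom='-1' cto='-1'><pred>UNKNOWN_REL</pred><label vid='1'/></ep>
<ep cfrom='0' cto='1'><pred>GENERIC_ENTITY_REL</pred><label vid='3'/></ep>
</mrs>"""


def eps_without_span(mrs_xml: str) -> list:
    """Return the predicate names of all EPs whose cfrom or cto is negative."""
    root = ET.fromstring(mrs_xml)
    missing = []
    for ep in root.iter("ep"):
        # treat a missing attribute the same as the -1 placeholder
        cfrom = int(ep.get("cfrom", "-1"))
        cto = int(ep.get("cto", "-1"))
        if cfrom < 0 or cto < 0:
            missing.append(ep.findtext("pred", default="?"))
    return missing


print(eps_without_span(SAMPLE))
```

on the sample this reports only the EP carrying -1 positions; real output may use other predicate elements, which this sketch would label "?".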

best, oe


On 30 March 2012, at 18:55, Paul Haley <paul at haleyai.com> wrote:

> Here's a dump of the issue, FYI.  
> 
> 
> Essentially, PET appears not to maintain the MRS linkage to tokens if either generics are involved or chart mapping is not used, which seems inappropriate in either case.
> 
> 
> With chart mapping, undefined words are not recognized:
> build/debug/cheap/cheap (0.99.14svn_cm $Change: 850 $) -nsolutions=1 -verbose=4 -mrs=new -default-les=all -cm ../ERG/english.grm
> 
> Eukaryotic cells contain mitochondria.
> ...
> no lexicon entries for:
>     "eukaryotic"
>     "mitochondria."
> ...
> Without chart mapping the words are recognized but the MRS loses reference to the chart (i.e., by token position):
> 
> Eukaryotic cells contain mitochondria.
> ...
> <mrs>
> <label vid='1'/><var vid='2'/>
> <ep cfrom='-1' cto='-1'><pred>UNKNOWN_REL</pred><label vid='1'/>
> ...
> The following shows that when the words are known to the ERG the MRS has the position information in chart mapping mode:
> 
> this is a test.
> ...
> <mrs>
> <label vid='1'/><var vid='2'/>
> <ep cfrom='0' cto='1'><pred>GENERIC_ENTITY_REL</pred><label vid='3'/>
> ...
> 
> Thanks again, and sorry for omitting the detail from the prior email.
> 
> Paul
> 
> On 03/30/2012 11:55 AM, Paul Haley wrote:
>> Hello again,
>> 
>> I was able to isolate the change in my environment to the use of the chart mapping option.
>> 
>> Apparently, the from/to attributes of the MRS (shown here around a colon) are -1 unless chart-mapping is selected: 
>> 
>>     [ LTOP: h1  INDEX: e2 [ e SF: PROP TENSE: PAST MOOD: INDICATIVE PROG: - PERF: - ]  RELS: <   [ appos_rel<0:36>
>> 
>> I dropped the -cm intending to receive the explosion of generics discussed in the "unknown word handling and chart mapping section" of http://moin.delph-in.net/PetInput.
>> 
>> This was intentional since we are looking at the chart in detail, extracting "insights" from PET/ERG, and for this reason want (as much) mapping information from MRS to the chart (as practical).
>> 
>> The from/to is helpful, but more direct linkage between elementary predications and the chart would be even better.
>> 
>> Regards,
>> Paul
>> 
>> P.S. We send FSC, too, and would appreciate advice or examples on how to constrain those with deeper semantics (as in the excellent discussion at http://moin.delph-in.net/SuquamishMRSWordNet).
>> 
>> 
>> On 03/29/2012 02:18 PM, Paul Haley wrote:
>>> 
>>> Greetings,
>>>  
>>> I would appreciate any advice on how to get the cfrom/cto information output in the new MRS with the latest PET and ERG.
>>>  
>>> Thank you,
>>> Paul
>>>  
>>> Paul Haley
>>> Automata, Inc.
>>> (412) 716-6420
>>>  
>> 
> 
