[developers] cfrom/cto in MRS

Paul Haley paul at haleyai.com
Mon Apr 2 15:21:33 CEST 2012


Hi Rebecca,

First, I'm trying to figure out Stephan's suggestions at the end of his
follow-up...  They look promising.

Chart mapping is good! (as are ALL the improvements in NG, as far as we are
concerned - except this one little problem with generics...)

I understand that unknown words introduce some combinatorics, but this is
not a problem for our application.  More important and useful is handling of
(and insights about) unknown words (a unique capability of
constraint/unification-based approaches?).

We can send FSC for unknown words with (for example) uniform probabilities
of an unknown word being a noun, verb, adjective, or adverb, but it seems
awkward or impractical to do so without losing functionality.
 
We do send FSC, but only when we want to prune the search space, which is
(essentially) never on the first attempt at parsing.
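
To make the "uniform probabilities" idea concrete, here is a minimal sketch in Python. It is not FSC syntax and makes no claim about what PET accepts. The dictionary fields (form, pos, prob) and the function name are hypothetical, chosen only for illustration. It simply enumerates one hypothesis per open class for an unknown word, each with equal weight.

```python
# Hypothetical illustration only: these dicts are NOT the FSC format.
OPEN_CLASSES = ("noun", "verb", "adjective", "adverb")

def uniform_pos_hypotheses(token, classes=OPEN_CLASSES):
    """Return one hypothesis per open class, all with the same probability."""
    prob = 1.0 / len(classes)
    return [{"form": token, "pos": pos, "prob": prob} for pos in classes]

if __name__ == "__main__":
    for hyp in uniform_pos_hypotheses("mitochondria"):
        print(hyp)
```

Each unknown word yields four equally weighted hypotheses, which is where the combinatorics mentioned above come from.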




Thank you!
Paul

P.S. Is there a way in FSC to indicate phrasal or clausal boundaries (other
than MWEs)?


-----Original Message-----
From: bec.dridan at gmail.com [mailto:bec.dridan at gmail.com] On Behalf Of
Rebecca Dridan
Sent: Friday, March 30, 2012 1:48 PM
To: Paul Haley
Cc: developers at delph-in.net
Subject: Re: [developers] cfrom/cto in MRS

Hi Paul,

To the best of my knowledge, using PET with the ERG without chart-mapping is
basically deprecated now. Among other things, the characterisation
information you want is passed up using chart-mapping rules.

If I understand correctly, you want to parse utterances containing unknown
words without using a POS tagger? That's not a default setup (and will
probably lead to very slow parsing), so I think you would need to add
appropriate chart-mapping rules to the grammar to get that behaviour.  Or
maybe I am misunderstanding your goal?

Rebecca

On Fri, Mar 30, 2012 at 18:55, Paul Haley <paul at haleyai.com> wrote:
> Here's a dump of the issue, FYI.
>
>
> Essentially, PET appears not to maintain the MRS linkage to tokens if 
> either generics are involved or chart mapping is not used, which seems 
> inappropriate in either case.
>
>
> With chart mapping, undefined words are not recognized:
>
> build/debug/cheap/cheap (0.99.14svn_cm $Change: 850 $) -nsolutions=1
> -verbose=4 -mrs=new -default-les=all -cm ../ERG/english.grm
>
> Eukaryotic cells contain mitochondria.
> ...
> no lexicon entries for:
>     "eukaryotic"
>     "mitochondria."
> ...
>
> Without chart mapping the words are recognized but the MRS loses 
> reference to the chart (i.e., by token position):
>
> Eukaryotic cells contain mitochondria.
> ...
> <mrs>
> <label vid='1'/><var vid='2'/>
> <ep cfrom='-1' cto='-1'><pred>UNKNOWN_REL</pred><label vid='1'/> ...
>
> The following shows that when the words are known to the ERG the MRS 
> has the position information in chart mapping mode:
>
> this is a test.
> ...
> <mrs>
> <label vid='1'/><var vid='2'/>
> <ep cfrom='0' cto='1'><pred>GENERIC_ENTITY_REL</pred><label vid='3'/> 
> ...
>
> Thanks again, and sorry for omitting the detail from the prior email.
>
> Paul
>
>
> On 03/30/2012 11:55 AM, Paul Haley wrote:
>
> Hello again,
>
> I was able to isolate the change in my environment to the use of the 
> chart mapping option.
>
> Apparently, the from/to attributes of the MRS (shown here around a 
> colon) are -1 unless chart-mapping is selected:
>
>     [ LTOP: h1  INDEX: e2 [ e SF: PROP TENSE: PAST MOOD: INDICATIVE
>       PROG: - PERF: - ]  RELS: <   [ appos_rel<0:36>
>
> I dropped the -cm option, intending to receive the explosion of generics 
> discussed in the "unknown word handling and chart mapping" section of 
> http://moin.delph-in.net/PetInput.
>
> This was intentional since we are looking at the chart in detail, 
> extracting "insights" from PET/ERG, and for this reason want (as much) 
> mapping information from MRS to the chart (as practical).
>
> The from/to is helpful, but more direct linkage between elementary 
> predications and the chart would be even better.
>
> Regards,
> Paul
>
> P.S. We send FSC, too, and would appreciate advice or examples on how 
> to constrain those with deeper semantics (as in the excellent 
> discussion at http://moin.delph-in.net/SuquamishMRSWordNet).
>
>
> On 03/29/2012 02:18 PM, Paul Haley wrote:
>
> Greetings,
>
>
>
> I would appreciate any advice on how to get the cfrom/cto information 
> output in the new MRS with the latest PET and ERG.
>
>
>
> Thank you,
>
> Paul
>
>
>
> Paul Haley
>
> Automata, Inc.
>
> (412) 716-6420
>
>




