[developers] Punctuation and "-default-les" type mapping in PET/ERG

R. Bergmair rb432 at cam.ac.uk
Fri Apr 11 21:02:24 CEST 2008


I've played around with this some more. First, to summarize what
this has been about so far:

I call CHEAP as follows:


nice -n 40 $DELPHINHOME/pet/bin/cheap \
   -results=10 \
   -nsolutions=5 \
   -limit=12288 \
   -packing=15 \
   -mrs=rmrx \
   -tok=smaf \
   -default-les \
   $DELPHINHOME/erg/english.grm


Then I run a SMAF that looks like this:


<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE smaf SYSTEM 
"file:///auto/homes/rb432/workspace/PyRMRS/dtd/smaf.dtd">
<smaf>
   <lattice init="v0" final="v4">
     <edge type="token" id="t1" source="v0" target="v1" cfrom="0" cto="8">Reptiles</edge>
     <edge type="token" id="t2" source="v1" target="v2" cfrom="9" cto="13">have</edge>
     <edge type="token" id="t3" source="v2" target="v3" cfrom="14" cto="16">no</edge>
     <edge type="token" id="t4" source="v3" target="v4" cfrom="17" cto="21">fur.</edge>
     <edge type="pos" id="p0" source="v0" target="v1" cfrom="0" cto="8" deps="t1">
       <slot name="tag">NN2</slot>
     </edge>
     <edge type="pos" id="p1" source="v1" target="v2" cfrom="9" cto="13" deps="t2">
       <slot name="tag">VH0</slot>
     </edge>
     <edge type="pos" id="p2" source="v2" target="v3" cfrom="14" cto="16" deps="t3">
       <slot name="tag">AT</slot>
     </edge>
     <edge type="pos" id="p3" source="v3" target="v4" cfrom="17" cto="21" deps="t4">
       <slot name="tag">NN1</slot>
     </edge>
   </lattice>
</smaf>
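
For anyone who wants to reproduce this with other sentences, here is a minimal Python sketch that generates a lattice of the same shape from a sentence and its POS-tagged tokens. The function name build_smaf and its arguments are made up for illustration; the element and attribute names are copied from the example above. It does not emit the XML declaration or the DOCTYPE line, and I have not checked whether cheap needs them.

```python
import xml.etree.ElementTree as ET


def build_smaf(text, tagged_tokens):
    """Build a SMAF lattice string (hypothetical helper).

    tagged_tokens is a list of (surface, pos_tag) pairs whose surface
    forms occur in `text` in the given order.
    """
    # Locate each token in the text to get its character span.
    spans = []
    pos = 0
    for surface, _tag in tagged_tokens:
        start = text.index(surface, pos)
        end = start + len(surface)
        spans.append((start, end))
        pos = end

    smaf = ET.Element("smaf")
    lattice = ET.SubElement(
        smaf, "lattice",
        {"init": "v0", "final": "v%d" % len(tagged_tokens)})

    # Token edges t1..tN, one per word, chained v0 -> v1 -> ... -> vN.
    for i, ((surface, _tag), (start, end)) in enumerate(zip(tagged_tokens, spans)):
        edge = ET.SubElement(lattice, "edge", {
            "type": "token", "id": "t%d" % (i + 1),
            "source": "v%d" % i, "target": "v%d" % (i + 1),
            "cfrom": str(start), "cto": str(end)})
        edge.text = surface

    # POS edges p0..p(N-1), each depending on its token edge.
    for i, ((_surface, tag), (start, end)) in enumerate(zip(tagged_tokens, spans)):
        edge = ET.SubElement(lattice, "edge", {
            "type": "pos", "id": "p%d" % i,
            "source": "v%d" % i, "target": "v%d" % (i + 1),
            "cfrom": str(start), "cto": str(end),
            "deps": "t%d" % (i + 1)})
        slot = ET.SubElement(edge, "slot", {"name": "tag"})
        slot.text = tag

    return ET.tostring(smaf, encoding="unicode")


if __name__ == "__main__":
    print(build_smaf("Reptiles have no fur.",
                     [("Reptiles", "NN2"), ("have", "VH0"),
                      ("no", "AT"), ("fur.", "NN1")]))
```

The id numbering (t1.. for tokens, p0.. for POS edges) just mirrors the example and is not meant as a statement about what SMAF requires.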


PET returns some results on this, but also outputs some warnings
that look a bit worrying:


  WARNING: failed to unify new path-value (SYNSEM.LKEYS.KEYREL.CARG = have) into fs (type: v_np-prd_oeq-ntr_le)
; WARNING: failed to unify new path-value (SYNSEM.LKEYS.KEYREL.CARG = have) into fs (type: v_np-vpslnp_oeq_le)
; WARNING: failed to unify new path-value (SYNSEM.LKEYS.KEYREL.CARG = have) into fs (type: v_np-vp_aeq-prp_le)
; WARNING: failed to unify new path-value (SYNSEM.LKEYS.KEYREL.CARG = no) into fs (type: d_-_no_le)
; WARNING: failed to unify new path-value (SYNSEM.LKEYS.KEYREL.CARG = no) into fs (type: av_-_dg-any_le)

Let's call these the type-I warnings.


; WARNING: failed to create dag for new path-value ("Reptiles" = Reptiles)
; WARNING: failed to create dag for new path-value ("Reptiles" = Reptiles)
; WARNING: failed to create dag for new path-value ("Reptiles" = Reptiles)
(1) `reptiles have no fur.' [12288] --- 1 (0.03|0.04s) <22:158> (1913.2K) [0.0s]

Let's call these the type-II warnings.
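
To compare setups (mine and C.J.'s, say), it may help to count the two kinds of warnings automatically. The following is a rough Python sketch, assuming the messages have exactly the wording shown above. The names join_wrapped and classify are made up, and the re-joining step is only there because the output may arrive line-wrapped, for example after passing through a mail client.

```python
import re
from collections import Counter

# Patterns follow the wording of the warnings quoted in this message.
TYPE_I = re.compile(
    r"failed to unify new path-value "
    r"\((?P<path>\S*) = (?P<value>.+?)\)\s+into fs \(type: (?P<type>[^)]+)\)")
TYPE_II = re.compile(r"failed to create dag for new path-value")


def join_wrapped(lines):
    """Glue continuation lines back onto the preceding record.

    A line containing 'WARNING:' starts a new record; any other
    non-empty line is treated as a continuation. Non-warning output
    (such as the result line) therefore ends up appended to the
    previous record, which is harmless for the searches below.
    """
    records = []
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue
        if "WARNING:" in stripped or not records:
            records.append(stripped)
        else:
            records[-1] += " " + stripped
    return records


def classify(lines):
    """Return (counts per warning kind, list of type-I failures)."""
    counts = Counter()
    failures = []
    for record in join_wrapped(lines):
        match = TYPE_I.search(record)
        if match:
            counts["type-I"] += 1
            failures.append(
                (match.group("type"), match.group("path"), match.group("value")))
        elif TYPE_II.search(record):
            counts["type-II"] += 1
    return counts, failures
```

Feeding it the stderr of a run, for instance via classify(open("cheap.log")) with a log file name of your choosing, gives the per-kind counts plus the (type, path, value) triples for the type-I cases.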


C.J. is getting the same warnings in his SciBorg setup.


In his last message Dan pointed out that PET was trying to unify
CARGs into types that don't license them, and assumed that this
was because I was using proper lexical types (rather than generic
ones) in my posmapping.

But this is not so. In my posmapping I'm using only the
"generic_..." types, and I've verified that they all have a
"SYNSEM.LKEYS.KEYREL.CARG" path with type *top*.

After experimenting with this some more, it now seems to me that
PET is actually trying to unify CARGs into the FS for each word,
*regardless* of whether the word has been instantiated from the
posmapping as a generic type, or whether the word is actually
coming out of the ERG lexicon.
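
To make the hypothesis concrete, here is a toy Python model of it. This is not PET code and says nothing about how PET is actually implemented: each lexical type is reduced to a flag saying whether it licenses CARG, "generic_noun" is a made-up stand-in for one of the "generic_..." types, and the stamp_native switch is a hypothetical knob that distinguishes "stamp every word" from "stamp only generics".

```python
# Toy model, NOT PET code: each lexical type only records whether it
# licenses the path SYNSEM.LKEYS.KEYREL.CARG.
LICENSES_CARG = {
    "generic_noun": True,          # hypothetical stand-in for a "generic_..." type
    "v_np-prd_oeq-ntr_le": False,  # type name taken from the type-I warnings
    "d_-_no_le": False,            # type name taken from the type-I warnings
}


def stamp_carg(orth, le_type):
    """Return a warning string if CARG cannot be added, else None."""
    if LICENSES_CARG[le_type]:
        return None
    return ("failed to unify new path-value (SYNSEM.LKEYS.KEYREL.CARG = %s) "
            "into fs (type: %s)" % (orth, le_type))


def parse_warnings(entries, stamp_native):
    """Collect warnings for (orth, le_type, is_generic) triples.

    With stamp_native=True, CARG is stamped into every entry, generic
    or not (the behaviour hypothesized above). With stamp_native=False,
    only generic entries are stamped.
    """
    warnings = []
    for orth, le_type, is_generic in entries:
        if is_generic or stamp_native:
            warning = stamp_carg(orth, le_type)
            if warning is not None:
                warnings.append(warning)
    return warnings


ENTRIES = [
    ("Reptiles", "generic_noun", True),
    ("have", "v_np-prd_oeq-ntr_le", False),
    ("no", "d_-_no_le", False),
]

if __name__ == "__main__":
    for line in parse_warnings(ENTRIES, stamp_native=True):
        print("; WARNING:", line)
```

In this toy, only the stamp_native=True variant produces warnings for "have" and "no", which is the pattern I am seeing; with stamp_native=False there would be none.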

I tried putting a CARG of type string into the "relation" type,
thus giving every EP in the whole grammar a CARG. This made the
type-I warnings disappear. Obviously, this is not acceptable even
as an ugly hack, as the garbage CARGs that thereby make their way
into the MRSes will confuse any kind of subsequent processing.

Can one of the PET developers check if this is a reasonable
account of what might be happening, and, ideally, fix it?


However, the type-II warnings are still there, and I'm
absolutely clueless as to what's going on here.
Any ideas?


Any help would be much appreciated. (...and I assume this
goes for C.J. as well).


Thanks!

Richard



On Fri, 21 Mar 2008, Dan Flickinger wrote:

>
>>>> ; WARNING: failed to unify new path-value ( = "No") into fs (type:
>>>> d_-_no_le)
>>>> ; WARNING: failed to unify new path-value (SYNSEM.LKEYS.KEYREL.CARG =
>>>> have) into fs (type: v_np-prd_oeq-ntr_le)
>>>
>>> If these messages persist after you've removed the generic-le-suffixes
>>> setting, then maybe someone else can help diagnose this, or maybe
>>> indeed you can just ignore the warnings.
>>
>> I am still getting them.
>
> On reflection, I now see why you're getting those messages.  You are using
> a larger set of generic lexical entries than I usually use, and the clumsy
> mechanism that PET currently uses for building the semantics for an
> unknown word is to simply stamp its orthography into the CARG attribute
> of that entry's EP.  To accommodate this hack (grudgingly), I defined
> distinct lexical types for each of the generic lexical entries, so that
> these types could idiosyncratically license the CARG attribute (which
> ordinarily is only used for proper names, numerals, etc).  Since you are
> using the ordinary lexical types to instantiate your richer set of
> generics, you would have to now duplicate all of these types to make ones
> which allow that CARG attribute, since I'm not willing to have CARG
> floating around in every EP in the ERG's MRSs.  This is clearly undesirable.
>
> Instead, I think it is finally time (for someone) to fix this hack in
> the handling of generic entries in PET, so the PRED value for an unknown
> word is a string which is constructed on the fly from its stem orthography
> and its POS, using the following template:
>
>  "_ORTH_POS_unk_rel"
>
> Then we should get rid of the assignment of a value for CARG except for
> proper names, where the PRED should be "named_rel" and the orthography
> should be the CARG value.
>
> The benefit would be that we could then always use an existing open-class
> lexical type to instantiate an entry for an unknown word, with a
> predictable semantics and a type which would better match what we find
> in our treebanks.
>
> Yi, maybe you have already done something like this?  Or could you at
> least estimate how much work would be involved in making the change?
>
> Thanks,
>
> Dan
>
>
>
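
For reference, the scheme Dan proposes in the quoted message could be sketched in Python roughly as follows. The function name and the dictionary representation are made up for illustration. How the stem and POS should be normalized (case, tag set) is not specified in the message, so the sketch substitutes them unchanged.

```python
def unknown_word_semantics(stem, pos, is_proper_name=False):
    """Hypothetical helper following the quoted "_ORTH_POS_unk_rel" template.

    Proper names keep PRED "named_rel" with the orthography as CARG.
    Every other unknown word gets a PRED string built from its stem
    orthography and POS, and no CARG at all.
    """
    if is_proper_name:
        return {"PRED": "named_rel", "CARG": stem}
    return {"PRED": "_%s_%s_unk_rel" % (stem, pos)}


if __name__ == "__main__":
    print(unknown_word_semantics("reptile", "n"))
    print(unknown_word_semantics("Bergmair", "n", is_proper_name=True))
```

The point of the design, as I read the quoted message, is that only the proper-name branch ever introduces a CARG, so ordinary open-class lexical types would no longer need to license one.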


