[developers] Punctuation and "-default-les" type mapping in PET/ERG

R. Bergmair rb432 at cam.ac.uk
Wed Apr 16 20:33:12 CEST 2008

Thanks, Peter and CJ, for the hints.

After implementing those changes, I have now got rid of both types of
WARNING messages. It seems the line responsible was really the
"gMap.carg=deps.content" setting in the default SMAF config.

I've also changed my POS mapping and my SMAF configuration, basically
by using C.J.'s settings (and throwing out all the obviously
SciBorg-specific stuff).

The "fallback POS mappings" CJ is using in his saf.conf,
of the form "gMap.type='aj_-_i-unk_le'", however, made
the problem reappear.

For the record: If anyone encounters this problem in the future,
I think the minimalistic solution is to change the line

   pos.[] -> edgeType='morph' fallback='' pos=content.tag gMap.carg=deps.content

to

   pos.[] -> edgeType='morph' fallback='' pos=content.tag

in any smaf configuration based on the old default.
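
For anyone who has several such configurations lying around, here is a
minimal Python sketch of that edit. It assumes the rule sits on a single
line beginning with "pos.[]", as in the old default; the function name
strip_carg_from_pos_rule is made up for illustration and is not part of
PET.

```python
import re

# Assumption: the setting appears exactly as in the old default config,
# separated from the preceding setting by whitespace.
CARG_SETTING = re.compile(r"\s+gMap\.carg=deps\.content\b")


def strip_carg_from_pos_rule(config_text: str) -> str:
    """Drop gMap.carg=deps.content from pos.[] rules; leave other lines alone."""
    fixed_lines = []
    for line in config_text.splitlines():
        if line.lstrip().startswith("pos.[]"):
            line = CARG_SETTING.sub("", line)
        fixed_lines.append(line)
    return "\n".join(fixed_lines)


old_rule = (
    "pos.[] -> edgeType='morph' fallback='' "
    "pos=content.tag gMap.carg=deps.content"
)
print(strip_carg_from_pos_rule(old_rule))
```

Only lines starting with "pos.[]" are touched, so a carg setting on any
other rule is left as it is.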

I just saw that the sample SMAF config in the Wiki, under
http://wiki.delph-in.net/moin/SmafPet, already reflects this
(but not the one under http://wiki.delph-in.net/moin/SmafPet).

Perhaps the default configuration used by PET in the case where
SMAF input is used without a smaf-conf setting should also be
changed accordingly?



On Mon, 14 Apr 2008, Christopher Rupp wrote:

> Hi Richard,
> I don't know if this is getting too parochial now, but I'll still address the
> developers list too.
> So you've got a rule for 'pos' annotation in SMAF:
> pos.[] -> edgeType='morph' fallback='' pos=content.tag gMap.carg=deps.content
> which addresses a 'carg' construct, a 'carg' definition which sets a path:
> define gMap.carg (synsem lkeys keyrel carg) STRING
> A SMAF containing 'pos' and 'token' annotations:
> <?xml version='1.0' encoding='UTF-8'?>
> <!DOCTYPE smaf SYSTEM
> "file:///auto/homes/rb432/workspace/PyRMRS/dtd/smaf.dtd">
> <smaf>
>   <lattice init="v0" final="v4">
>     <edge type="token" id="t1" source="v0" target="v1" cfrom="0" cto="8">
> Reptiles</edge>
>     <edge type="token" id="t2" source="v1" target="v2" cfrom="9" cto="13">
> have</edge>
>     <edge type="token" id="t3" source="v2" target="v3" cfrom="14" cto="16">
> no</edge>
>     <edge type="token" id="t4" source="v3" target="v4" cfrom="17" cto="21">
> fur.</edge>
>     <edge type="pos" id="p0" source="v0" target="v1" cfrom="0" cto="8"
> deps="t1">
>       <slot name="tag">NN2</slot>
>     </edge>
>     <edge type="pos" id="p1" source="v1" target="v2" cfrom="9" cto="13"
> deps="t2">
>       <slot name="tag">VH0</slot>
>     </edge>
>     <edge type="pos" id="p2" source="v2" target="v3" cfrom="14" cto="16"
> deps="t3">
>       <slot name="tag">AT</slot>
>     </edge>
>     <edge type="pos" id="p3" source="v3" target="v4" cfrom="17" cto="21"
> deps="t4">
>       <slot name="tag">NN1</slot>
>     </edge>
>   </lattice>
> </smaf>
> And a partial posmapping which defines types containing the 'carg' path for
> two of the four tokens in the input. And the issue is what happens when no
> posmapping is defined for the 'pos' input? (Leaving aside for the moment the
> "Reptiles"/Reptiles mismatch.)
> As the lexicon is probably the only other place information can come from, it
> looks like there's a conflict between the SMAF rule and the lexicon content.
> In this case, that's probably what you want, provided the lexical entry is
> able to ignore the input from the SMAF rule. More generally, there may be
> cases where you want to merge information into the lexicon entry, in the same
> way that you want to merge the token specific information into the generic
> lexical entry spawned by the posmapping rules that are defined.
> So does this input get an analysis in which at least one of the generic lexical
> entries gets instantiated correctly? (Or is all the SMAF input superseded by
> the lexicon?) I'm just speculating here and I'd much rather not have to. While
> this is a reasonable abductive argument, it ought to be confounded by the fact
> that my posmapping is a bit different but still partial, but I was able to get
> rid of the warnings.
> I really just want to understand what is going on here.
> Cheers,
> C.J.
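
To make the quoted rule a bit more concrete, here is a rough Python sketch
of what I take its two settings to refer to in the SMAF above:
"pos=content.tag" as the "tag" slot of the pos edge, and
"gMap.carg=deps.content" as the text of the token edge named in "deps".
That reading is my assumption, not something taken from the PET sources.
The input is a cut-down copy of the quoted SMAF, and the name
pos_edges_with_tokens is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Cut-down copy of the quoted SMAF: two tokens and their pos edges.
SMAF = """<smaf>
  <lattice init="v0" final="v2">
    <edge type="token" id="t1" source="v0" target="v1" cfrom="0" cto="8">Reptiles</edge>
    <edge type="token" id="t2" source="v1" target="v2" cfrom="9" cto="13">have</edge>
    <edge type="pos" id="p0" source="v0" target="v1" cfrom="0" cto="8" deps="t1">
      <slot name="tag">NN2</slot>
    </edge>
    <edge type="pos" id="p1" source="v1" target="v2" cfrom="9" cto="13" deps="t2">
      <slot name="tag">VH0</slot>
    </edge>
  </lattice>
</smaf>"""


def pos_edges_with_tokens(smaf_xml):
    """Return (tag, token text) for each pos edge, following its deps attribute."""
    root = ET.fromstring(smaf_xml)
    edges = root.findall("./lattice/edge")
    # Map token edge ids to their text content.
    tokens = {
        e.get("id"): (e.text or "").strip()
        for e in edges
        if e.get("type") == "token"
    }
    pairs = []
    for e in edges:
        if e.get("type") != "pos":
            continue
        slot = e.find("slot[@name='tag']")
        tag = slot.text.strip() if slot is not None and slot.text else None
        pairs.append((tag, tokens.get(e.get("deps"))))
    return pairs


for tag, token in pos_edges_with_tokens(SMAF):
    print(tag, token)
```

On this reading, every pos edge carries the token string along as a carg
value, whether or not the type selected by the POS mapping has a carg path.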
