[developers] Questions about chart mapping

John Carroll J.A.Carroll at sussex.ac.uk
Thu Jun 4 19:09:48 CEST 2020


Hi developers,

I've started to look at chart mapping and how it might be implemented. I've been reading the following:

'Tutorial - Chart Mapping in PET' at DELPH-IN Summit 2009 http://www.delph-in.net/2009/cm.pdf
LREC 2008 paper http://www.lrec-conf.org/proceedings/lrec2008/pdf/349_paper.pdf

I've also been checking my understanding of the formalism by looking at the token mapping rules in the ERG 2018 directory tmr/. I have a few questions below which I've tried to contextualise with respect to the tutorial slides. I hope an expert can answer them.

> Copying Information
> 
> * reentrancies can be used to copy information from INPUT to OUTPUT

Presumably reentrancies can also be used to copy information from CONTEXT to OUTPUT?

> Chart Mapping Procedure
> 
> * a rule match is completed if all CONTEXT and INPUT arguments are bound

What happens if there are several ways of matching chart edges to CONTEXT in a rule? Is the rule applied repeatedly, once for each alternative match? Or is only one of the alternative matches considered? This could matter if feature values or regular expression captures are copied from the context to the output.
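
To illustrate why the choice matters, here is a toy Python sketch. It is not PET's implementation: edges are plain strings, and `find_matches`, `apply_rule` and the `all_matches` flag are hypothetical names invented for this example. When two chart edges both satisfy CONTEXT, the two policies give different outputs as soon as material is copied from the context:

```python
from itertools import product

def find_matches(chart, is_context, is_input):
    """Return every (context, input) pair of distinct edges that fits the rule."""
    contexts = [e for e in chart if is_context(e)]
    inputs = [e for e in chart if is_input(e)]
    return [(c, i) for c, i in product(contexts, inputs) if c != i]

def apply_rule(chart, is_context, is_input, build, all_matches=True):
    """Build one output per match, or only for the first match found."""
    found = find_matches(chart, is_context, is_input)
    if not all_matches:
        found = found[:1]
    return [build(c, i) for c, i in found]

# Toy chart: two edges that can serve as CONTEXT, one that can serve as INPUT.
chart = ["Dr", "Prof", "smith"]
is_title = lambda e: e in {"Dr", "Prof"}
is_name = lambda e: e.islower()
copy_title = lambda c, i: f"{i}+{c}"  # the output copies a value from the context

every = apply_rule(chart, is_title, is_name, copy_title)
first = apply_rule(chart, is_title, is_name, copy_title, all_matches=False)
print(every)  # ['smith+Dr', 'smith+Prof']
print(first)  # ['smith+Dr']
```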

> * each rule is applied until its fixpoint is reached

If I've understood the formalism correctly, I can imagine a rule that doesn't ever reach a fixpoint for some inputs (e.g. a rule in which the input and output unify, with the output building structure). Is the intended interpretation the following: a rule is never applied more than once to the same combination of input and context edges? And it's up to the grammarian to avoid writing infinitely looping rules?
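
The interpretation above can be made concrete with a minimal Python sketch. It rests on assumptions: edges are plain strings, a rule is a function that returns an output edge or `None`, context edges are left out, and `apply_until_fixpoint`, `lowercase` and `grow` are hypothetical names, not anything taken from PET. The `max_steps` guard exists only so that the looping case can be demonstrated:

```python
def apply_until_fixpoint(chart, rule, max_steps=1000):
    """Apply `rule`, never reusing an input edge; return (chart, steps, reached)."""
    chart = list(chart)
    used = set()  # edges the rule has already consumed as input
    steps = 0
    while steps < max_steps:
        pending = [e for e in chart if e not in used and rule(e) is not None]
        if not pending:
            return chart, steps, True  # fixpoint: no unused match remains
        edge = pending[0]
        used.add(edge)
        chart.remove(edge)         # the input edge is replaced ...
        chart.append(rule(edge))   # ... by the rule's output edge
        steps += 1
    return chart, steps, False     # gave up: every output was a fresh input

def lowercase(edge):
    """Matches only edges containing uppercase, so its output never rematches."""
    return edge.lower() if edge != edge.lower() else None

def grow(edge):
    """Matches any edge and builds structure, so its output always rematches."""
    return edge + "x"

print(apply_until_fixpoint(["Foo", "bar"], lowercase))  # (['bar', 'foo'], 1, True)
print(apply_until_fixpoint(["a"], grow, max_steps=5))   # (['axxxxx'], 5, False)
```

Under this reading the bookkeeping of used inputs does not help with `grow`: each output is an edge that has never been an input, so only the step limit stops it.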

If this is the correct interpretation, then I'm puzzled by a few rules in the ERG: bridge_tmr in tmr/bridge.tdl, and the four rules default_(ld|lb|rd|rb)_tmr in tmr/gml.tdl. Their inputs seem to unify with their outputs, so surely each would loop forever: an input edge would match and be replaced with a new output edge, and since this new edge has not previously been used as an input, the rule would pick it up and apply again, and so on.

Aside from the fixpoint issue, I'm not sure I understand the purpose of the rules default_(ld|lb|rd|rb)_tmr. At first glance they seem to merely replace their input. Is their purpose to remove all features that are not specified on the input side?

I'm also puzzled by the following comment on bridge_tmr:

> ;; ...  here, we take advantage of redundancy detection built into
> ;; token mapping, i.e. even though the rule is written as if it could apply any
> ;; number of times per cell, there shall not be duplicates in the token chart.

What enforces the restriction that "there shall not be duplicates in the token chart"? I can't see any mention of redundancy detection or of this restriction in the paper or tutorial slides. Is the restriction somehow enforced by the fixpoint condition?
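
One possible reading, sketched below in Python purely as an assumption and not as a description of what PET does, is that an output edge identical to one already in the chart is not added, and that a rule application which adds nothing does not count as a change. Edges here are `(start, end, label)` tuples, and `saturate` and `bridge` are hypothetical names. Under that reading a rule written as if it could apply any number of times per cell still terminates:

```python
def saturate(chart, rule, max_steps=1000):
    """Add rule outputs until no new edge appears; return (chart, edges_added)."""
    chart = set(chart)  # a set cannot hold two identical edges
    added = 0
    changed = True
    while changed and added < max_steps:
        changed = False
        for edge in sorted(chart):  # iterate over a copy so the set can grow
            out = rule(edge)
            if out is not None and out not in chart:
                chart.add(out)      # only a genuinely new edge counts as a change
                added += 1
                changed = True
    return chart, added

def bridge(edge):
    """Toy rule: propose one 'bridge' edge over the span of any edge."""
    start, end, _label = edge
    return (start, end, "bridge")

tokens = {(0, 1, "a"), (0, 1, "b"), (1, 2, "c")}
result, added = saturate(tokens, bridge)
print(added)           # 2
print(sorted(result))
```

The rule matches five times on the final chart, its own outputs included, yet only one bridge edge per cell is ever added.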

Thanks in advance for clarification on these points.

John



