[developers] Extracting surface form of tokens from derivation trees
Ann Copestake
Ann.Copestake at cl.cam.ac.uk
Thu Apr 24 21:34:28 CEST 2014
sweaglesw at sweaglesw.org said:
> The MRS CFROM/CTO come straight from the +FROM and +TO properties of the
> post-token-mapping tokens dominated by the edge each EP is introduced on.
> Unfortunately they do not uniquely identify such a token; for example:
> We admired the sky-blue water.
> This yields a 'sky-' token and a 'blue' token, both with identical +FROM and
> +TO, and correspondingly a _sky_n_1_rel EP and a _blue_a_1_rel EP, both with
> identical CFROM/CTO. The span "sky-blue" is considered a single token
> before token-mapping (e.g. as the input to TNT), so the answer to your
> question (b) is yes. I don't see that this is a problem from the point of
> view of (a), if what you want is a correspondence between EPs and TNT-level
> tokens, since the EPs still point to the input token spans in this case.
OK, so if we were generating some form of tfrom/tto (token indices), for that sentence we'd get
_sky_n_1_rel 4:4 _blue_a_1_rel 4:4
and:
The water was sky-blue.
would yield:
_sky_n_1_rel 4:5 _blue_a_1_rel 4:5
I think this might imply some special treatment would be required, but I'm not
sure.
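As a rough sketch of the mapping behind those numbers: the Python below turns an EP's CFROM/CTO into 1-based indices over the pre-token-mapping (TNT-level) tokens by collecting every token whose character span overlaps the EP's span. The function name, the token lists and the character offsets are hypothetical and worked out by hand for the two example sentences; they are not output from the parser. It also assumes the sentence-final period is a separate input token that token mapping folds into the span of the preceding word, which is one way to read the 4:5 figure.

```python
from typing import List, Tuple

# (start, end) character offsets, end exclusive. Hypothetical representation.
Span = Tuple[int, int]


def token_span(cfrom: int, cto: int, tokens: List[Span]) -> Tuple[int, int]:
    """Return 1-based (tfrom, tto) of the input tokens overlapping [cfrom, cto)."""
    hits = [
        index
        for index, (start, end) in enumerate(tokens, start=1)
        if start < cto and end > cfrom  # half-open interval overlap
    ]
    if not hits:
        raise ValueError(f"no input token overlaps {cfrom}:{cto}")
    return hits[0], hits[-1]


# "We admired the sky-blue water."
# Assumed input tokens: We / admired / the / sky-blue / water / .
attributive = [(0, 2), (3, 10), (11, 14), (15, 23), (24, 29), (29, 30)]

# "The water was sky-blue."
# Assumed input tokens: The / water / was / sky-blue / .
predicative = [(0, 3), (4, 9), (10, 13), (14, 22), (22, 23)]

# Both EPs share one CFROM/CTO, so they map to the same token span.
for pred in ("_sky_n_1_rel", "_blue_a_1_rel"):
    print(pred, "%d:%d" % token_span(15, 23, attributive))

# Assumed: CTO here includes the attached period (offset 23, not 22).
for pred in ("_sky_n_1_rel", "_blue_a_1_rel"):
    print(pred, "%d:%d" % token_span(14, 23, predicative))
```

Under these assumptions the two EPs are indistinguishable by token span in both sentences, and in the second one the span also takes in the punctuation token, which is the sort of case where special treatment might be needed.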
Thanks,
Ann