[developers] Malformed RMRS XML output from an ugly but valid PIC:

Ann Copestake Ann.Copestake at cl.cam.ac.uk
Tue Nov 17 23:59:00 CET 2009

The unknown word code is presumably adding the leading _ and the
final _rel?  Really it should be stripping all underscores from the
string it's turning into a pred.  The code is based on the assumption
that we have _lemma_postag[_sense]_rel, where lemma, postag and sense
contain no underscores.  I'm reasonably certain that we made the ERG
hand-built entries conform to that convention.  Anyway, I checked with
the ERG demo (for lack of another option right this second) and the
unknown word mechanism apparently isn't removing underscores from the
lemma portion, so it looks to me as though we're systematically
generating incorrect pred names.  We can change the convention, of
course, but we have to agree some convention.
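
To make the convention concrete, here is a rough Python sketch (not
the actual LKB or RMRS conversion code).  The function names are
hypothetical, and the set of pos tags is only a placeholder: the
authoritative list is whatever rmrs.dtd enumerates.  It builds a pred
for an unknown word with underscores stripped from each part, and
splits a pred back into its parts, rejecting anything that does not
fit _lemma_postag[_sense]_rel.

```python
# Placeholder inventory (an assumption); the real list is the one in rmrs.dtd.
ASSUMED_POS_TAGS = frozenset({"n", "v", "a", "j", "r", "s", "p", "q", "c", "x", "u"})


def make_unknown_pred(lemma, pos, sense=None):
    """Build _lemma_pos[_sense]_rel, removing underscores from every part."""
    parts = [lemma.replace("_", ""), pos.replace("_", "")]
    if sense is not None:
        parts.append(sense.replace("_", ""))
    return "_" + "_".join(parts) + "_rel"


def split_pred(pred, pos_tags=ASSUMED_POS_TAGS):
    """Return (lemma, pos, sense), or None if pred does not fit the convention."""
    # No leading underscore or no _rel suffix: do not attempt to decompose.
    if not (pred.startswith("_") and pred.endswith("_rel")):
        return None
    fields = pred[1:-len("_rel")].split("_")
    # Exactly lemma and pos, optionally followed by sense; no empty fields.
    if len(fields) not in (2, 3) or not all(fields):
        return None
    lemma, pos = fields[0], fields[1]
    sense = fields[2] if len(fields) == 3 else None
    # The field after the lemma must be a known pos tag.
    if pos not in pos_tags:
        return None
    return lemma, pos, sense
```

With this sketch, make_unknown_pred("foo_bar", "n") gives
"_foobar_n_rel", which splits cleanly, whereas a pred built without
stripping, such as "_foo_bar_n_rel", is rejected because "bar" is not
a pos tag.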



> > yes - for this case the RMRS conversion code could (and probably
> > should) check that the thing after the underscore is a valid pos tag,
> > given that rmrs.dtd enumerates these.  I don't know whether the
> > grammars conform in terms of ensuring that the pos label is limited
> > to those in rmrs.dtd, but I guess we'd soon find out ...
>
> i was actually assuming the lack of an initial underscore indicated we
> should not attempt decomposing the PRED value, i.e. there is no syntax
> convention imposed on `grammar' predicates?  i realize this should not
> be a grammar predicate, in principle; but it should not be a predicate
> in the first place, after all ...
>
> > Longer term, the split should be done as part of the conversion from
> > TFS to MRS (rather than in the MRS->RMRS conversion).  In fact,
> > ideally, the structure should be there in the grammars themselves,
>
> yes, i agree that would be a desirable upgrade.  besides the core MRS
> structures (and input output routines), i expect the generator, tests
> for equality or subsumption, and transfer engine would be affected by
> this change.  if we coordinate a little beforehand, i could take care
> of the latter two, which i believe should be doable in a day or two.
>                                                        cheers  -  oe
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> +++ Universitetet i Oslo (IFI); Boks 1080 Blindern; 0316 Oslo; (+47) 2284 0125
> +++    --- oe at ifi.uio.no; stephan at oepen.net; http://www.emmtee.net/oe/ ---
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
