[developers] Malformed RMRS XML output from an ugly but valid PIC:

Ann Copestake Ann.Copestake at cl.cam.ac.uk
Tue Nov 17 18:05:05 CET 2009


Yes. For this case the RMRS conversion code could (and probably should) check 
that the thing after the underscore is a valid pos tag, given that rmrs.dtd 
enumerates these.  I don't know whether the grammars actually restrict the 
pos label to the values in rmrs.dtd, but I guess we'd soon find out ...  
Ideally, I'd do this by parsing the rmrs.dtd file to extract the valid pos 
tags, rather than having to keep the Lisp code in sync with rmrs.dtd 
manually.  The XML output code doesn't call the escaping code on the pos 
tag, which would be OK if we'd already checked that it was a valid pos tag.
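
A rough sketch of the DTD-driven check, in Python rather than Lisp and purely 
for illustration.  The DTD fragment, the element name "realpred", and the 
function names are all assumptions made for the example; none of them is 
copied from rmrs.dtd or from the existing conversion code.

```python
import re

# Hypothetical DTD fragment, for illustration only; the real rmrs.dtd
# may declare the pos attribute differently.
SAMPLE_DTD = """
<!ELEMENT realpred EMPTY>
<!ATTLIST realpred
          lemma CDATA #REQUIRED
          pos (v|n|j|r|p|q|c|x|u|a|s) #REQUIRED
          sense CDATA #IMPLIED >
"""


def valid_pos_tags(dtd_text, element="realpred", attribute="pos"):
    """Return the set of values enumerated for element/attribute in the DTD."""
    # Walk over every ATTLIST declaration and pick the one for our element.
    for decl in re.finditer(r"<!ATTLIST\s+(\S+)\s+(.*?)>", dtd_text, re.DOTALL):
        if decl.group(1) != element:
            continue
        # Inside the declaration, look for "attribute (a|b|c)".
        enum = re.search(
            r"(?:^|\s)" + re.escape(attribute) + r"\s+\(([^)]*)\)",
            decl.group(2),
        )
        if enum:
            return {v.strip() for v in enum.group(1).split("|") if v.strip()}
    # No enumerated declaration found.
    return set()


def check_pos(candidate, tags):
    """Return candidate if it is a declared pos tag, otherwise None."""
    return candidate if candidate in tags else None


if __name__ == "__main__":
    tags = valid_pos_tags(SAMPLE_DTD)
    print(sorted(tags))
    print(check_pos("n", tags))
    print(check_pos("<&", tags))
```

With a check like this in front of the XML writer, anything that is not a 
declared pos tag is rejected before output, so the unescaped pos attribute 
can only ever hold one of the enumerated values.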

Longer term, the split should be done as part of the conversion from TFS to 
MRS (rather than in the MRS->RMRS conversion).  In fact, ideally, the 
structure should be there in the grammars themselves, which would allow it to 
be exploited in lexical operations.  For example, we could have a systematic 
relationship between inchoative and causative forms of verbs.

Ann




