[developers] MRSs in [incr tsdb()] and unknown words

Francisco Costa fcosta at di.fc.ul.pt
Thu Dec 4 18:14:52 CET 2008


We have a problem with the MRS representations that [incr tsdb()] shows
for sentences that were parsed with PET and contain unknown words.

We're parsing a corpus with PET, using the PET Input Chart format in
order to support unknown words and to constrain the grammatical category
of known words, based on manual annotations to the corpus. We use the
-default-les and -tsdbdump options for PET. We place the generated files
in a test suite, which we then annotate with [incr tsdb()].

However, in [incr tsdb()], the MRSs show a `nil' for unknown words
instead of their relation name. This is also what we get when we export
a test suite from [incr tsdb()].
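
To gauge how many items are affected, the exported MRSs can be scanned for
relations named nil. The following is only a minimal sketch: it assumes the
MRSs are in a bracketed text form in which each elementary predication is
printed as "[ <relation> LBL: ...", so the pattern, the function name
count_nil_relations and the sample string are all illustrative and may need
adjusting to the actual export layout.

```python
import re

# Assumption: an elementary predication is printed as "[ <relation> LBL: ...",
# so an unknown word without a relation name appears as "[ nil LBL: ...".
# Adjust this pattern if the export uses a different layout.
NIL_EP = re.compile(r"\[\s*nil\s+LBL:", re.IGNORECASE)


def count_nil_relations(mrs_text: str) -> int:
    """Return the number of elementary predications whose relation name is nil."""
    return len(NIL_EP.findall(mrs_text))


# Hypothetical MRS string, made up for illustration only.
sample = (
    "[ LTOP: h1 INDEX: e2 "
    "RELS: < [ nil LBL: h3 ARG0: x4 ] "
    "[ _sleep_v_rel LBL: h5 ARG0: e2 ARG1: x4 ] > ]"
)

print(count_nil_relations(sample))
```

Running this over each exported MRS would give a per-item count that can be
compared against the number of unknown words marked in the PIC input.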

In the PIC XML, we use the <fsmod> element to explicitly constrain the
feature paths where these relation names appear in the grammar. When we
run PET directly with PIC input and MRS output (-mrs option), the
relation names of unknown words are printed correctly.

Do you know what is causing this issue and how we could fix it?

Thank you very much in advance,


Francisco Costa
