[developers] Paraphrase grammars and SEMI's

Tue May 26 03:30:30 CEST 2009

One more note, regarding your example of where the internal type hierarchy is useful: it's clear that we need to improve our method for including the 'right' abstract types in the SEM-I, rather than just those predicates that appear in lexical entries or rules. We've long discussed the notion of including some annotation in the type definition files that would be read by the SEM-I construction machinery, and perhaps it's time to make this happen.

 Dan

----- Original Message -----
From: "Francis Bond" <fcbond at gmail.com>
To: developers at delph-in.net
Cc: "Darren Scott Appling" <darren.scott.appling at gmail.com>
Sent: Monday, May 25, 2009 2:59:49 PM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
Subject: [developers] Paraphrase grammars and SEMI's

G'day,

   we would like to solicit opinions (especially from Dan) about the
following question:

If you are making a paraphrase grammar, should it use the
language-internal MRS or the post-SEM-I external MRS?

That is:

Internal: TEXT - ERG - MRS - EnEn - MRS - ERG - TEXT
or
External: TEXT - ERG - MRS - MRS' - EnEn - MRS' - MRS - ERG - TEXT

My view is that we should use the external MRS for the following reasons:
(i) the SEM-I is the interface to outside resources, such as WordNet,
which we plan to mine for rules
(ii) the internal MRS is more likely to change without warning
(iii) current MT systems all use the external MRS

However, there are also reasons for choosing the internal MRS:
(i) full access to the grammar internals
(ii) the current paraphrasing rules use the internal MRS
(iii) two fewer steps to pass through

Using the external MRS also means that the internal hierarchy is
obscured.  To pick a not at all random example, we want to
underspecify B's A and (the) A of B.  This is easy to do in the current ERG
due to two hierarchies:
_the_q_rel := def_q_rel, _def_explicit_q_rel := def_q_rel
_of_q_rel := of_q_rel, poss_rel := of_q_rel.

By creating an MRS with def_q_rel and of_q_rel, both alternatives (and
more) are generated.
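
To make the underspecification concrete, here is a toy sketch in Python.
It is not how the ERG or the generator actually works; the PARENT table
just copies the four definitions quoted above, and the helper names
(leaves, expand) are made up for illustration.  It only shows that naming
the two supertypes stands in for every pairing of their subtypes:

```python
from itertools import product

# Child -> parent links, copied from the two hierarchies quoted above.
PARENT = {
    "_the_q_rel": "def_q_rel",
    "_def_explicit_q_rel": "def_q_rel",
    "_of_q_rel": "of_q_rel",
    "poss_rel": "of_q_rel",
}


def leaves(pred):
    """Return the most specific predicates subsumed by pred, sorted."""
    children = [c for c, p in PARENT.items() if p == pred]
    if not children:
        return [pred]  # pred has no subtypes, so it stands for itself
    found = []
    for child in children:
        found.extend(leaves(child))
    return sorted(found)


def expand(preds):
    """Enumerate every fully specified combination of the given predicates."""
    return [tuple(combo) for combo in product(*(leaves(p) for p in preds))]


combos = expand(["def_q_rel", "of_q_rel"])
for combo in combos:
    print(combo)
```

Under these assumptions the two supertypes expand to four pairings, which
is one way to read the "(and more)".
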
However, this is not obvious from the SEM-I, which only records what
predicates are exposed, and says nothing about their place in the
hierarchy.  It is also not encoded in predicates.erg.tdl, although
perhaps it should be?  Currently (for the erg, rather than terg) it
says:

poss_rel := predsort. (in predicates.tdl)
_of_p_rel := of_p_rel. (in predicates.erg.tdl)
def_explicit_q_rel := quant_rel. (in predicates.erg.tdl)
_the_q_rel := quant_rel. (in predicates.erg.tdl)
def_q_rel := quant_rel. (in predicates.tdl)
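
A quick way to see the gap is to read those five lines as child/parent
pairs and ask whether the abstract type still subsumes the specific one.
The following is only a sketch: the regular expression handles the simple
"a := b." lines quoted here, not TDL in general, and parse and subsumes
are hypothetical helper names.

```python
import re

# The five definitions quoted above, with their file annotations.
LISTING = """\
poss_rel := predsort. (in predicates.tdl)
_of_p_rel := of_p_rel. (in predicates.erg.tdl)
def_explicit_q_rel := quant_rel. (in predicates.erg.tdl)
_the_q_rel := quant_rel. (in predicates.erg.tdl)
def_q_rel := quant_rel. (in predicates.tdl)
"""

# Matches only single-parent lines of the form "child := parent."
DEFN = re.compile(r"^(\S+)\s*:=\s*(\S+?)\.(?:\s|$)")


def parse(listing):
    """Map each defined type to its single declared parent."""
    parents = {}
    for line in listing.splitlines():
        match = DEFN.match(line)
        if match:
            parents[match.group(1)] = match.group(2)
    return parents


def subsumes(parents, general, specific):
    """True if general is specific itself or one of its ancestors."""
    node = specific
    while node is not None:
        if node == general:
            return True
        node = parents.get(node)
    return False


parents = parse(LISTING)
print(subsumes(parents, "quant_rel", "_the_q_rel"))
print(subsumes(parents, "def_q_rel", "_the_q_rel"))
```

With only these definitions, _the_q_rel and def_q_rel are siblings under
quant_rel, so the second check comes out false and the relationship that
the internal hierarchy provides is lost.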

So, any thoughts?

-- 
Francis Bond <http://www2.nict.go.jp/x/x161/en/member/bond/>
NICT Language Infrastructure Group