[developers] Additions to the Simple MRS format

Fri Aug 29 09:41:45 CEST 2014

Hi Dan,

On Fri, Aug 29, 2014 at 4:54 AM, Dan Flickinger <danf at stanford.edu> wrote:
> Hi Mike -
>
>> > We also want to hear how your grammars deal with the TOP variable. In
>> > general, I think, the (actual) LTOP is equated with the top handle of
>> > a local structure, but when a full utterance is produced, the TOP (or
>> > GTOP, in Matrix-derived grammars) is QEQ'd to the handle (i.e. GTOP
>> > qeq LTOP), but this might not be true for all grammars. In particular,
>> > we'd like to know if it's ever necessary to have both TOP *and* LTOP
>> > representable in an MRS.
>
> While I agree that for a full utterance, it should be sufficient for an MRS to provide TOP (or GTOP), this elimination of LTOP in "an MRS" implies that only full utterances can have a corresponding MRS, or that the constituents contained in full utterances have an MRS which is not what is used in semantic composition, since the rules of composition need to have access to the phrase's LTOP.  So on this view, it would seem that we're saying that we don't combine smaller MRSs to compose larger MRSs, but that doesn't sound right.  Maybe we mean to distinguish a full utterance MRS from constituent MRSs, but that distinction does not look so sharp to me, given utterances that are not complete sentences.
>
> Is this merely a question of terminology, of what we decide to call "an MRS"?  I can see that LTOP is only relevant for a sign as long as further semantic composition is expected, and that we might use the term "an MRS" to refer ambiguously either to an object that only has TOP and is not suitable for semantic composition, or to an object with LTOP (and possibly also GTOP) which is available for further composition.  Or maybe we use another term for the semantics of a constituent, and reserve "an MRS" for full utterances.

Sorry, it looks like I was unclear. When I said "in an MRS", I meant
the MRS output by the grammar (e.g. in the 'mrs' field of a [incr
tsdb()] profile), and not the AVM representation. I don't doubt that
LTOP is necessary for composition, and don't propose taking it out of
grammars or renaming it in the grammars. What I was asking is if it's
necessary to have LTOP serialized along with (G)TOP in the SimpleMRS
format, e.g. for later realization using that MRS or something. My
guess is no, since up till now we've only had the (G)TOP, but
inaccurately called it LTOP (except in the cases where people may be
outputting MRSs for partial parses or something).
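
To make the serialization question concrete, here is a minimal Python sketch of how a reader could accept either key for the top handle. It is only an illustration under my own assumptions, not code from any existing processor: the function name `read_top` and the regular expression are hypothetical, and a real reader would parse the whole structure rather than search the string.

```python
import re

# Matches the serialized top handle under either key:
# "LTOP:" (the name used so far) or "TOP:" (the proposed name).
# The lookbehind stops "TOP:" from matching inside a longer
# attribute name such as "GTOP:".
_TOP_RE = re.compile(r'(?<![A-Za-z])(LTOP|TOP)\s*:\s*(h\d+)')


def read_top(simple_mrs):
    """Return (key, handle) for the first top attribute found, else None.

    Limitation of this sketch: it searches the raw string, so a quoted
    surface string that happens to contain text like "TOP: h3" would
    be matched first.
    """
    m = _TOP_RE.search(simple_mrs)
    return (m.group(1), m.group(2)) if m else None


old_style = '[ LTOP: h0 INDEX: e2 RELS: < > HCONS: < > ]'
new_style = '[ <0:6> "Barked" TOP: h0 INDEX: e2 RELS: < > HCONS: < > ]'

print(read_top(old_style))
print(read_top(new_style))
```

Since only one of the two keys is ever expected in a serialized MRS, a reader along these lines needs no way to represent both at once.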

You may be right that we could use a terminological distinction for
the semantics of a complete sentence and fragments. I had proposed
"semantic sentence", as well as "semantic constituent" for valid
subgraphs, but neither Emily nor Francis really liked the terms. By
valid subgraphs I mean connected, headed subgraphs. E.g. when you have
"the large dog" you can get both "large dog" and "the dog" as semantic
constituents (where the latter isn't a syntactic constituent of the
phrase), but "the large" is invalid here. I'd welcome a better term to
describe these (as well as one for non-sentential, non-constituent
subgraphs... like "semantic fragment" or just "subgraph").

> If you all discussed this, it would be good to know what conclusions you reached, and in any case, some clarification would help.
>
>  Dan
>
> ----- Original Message -----
> From: "Emily M. Bender" <ebender at uw.edu>
> To: "Michael Wayne Goodman" <goodmami at u.washington.edu>
> Cc: "Delph-in developers list" <developers at delph-in.net>
> Sent: Tuesday, August 26, 2014 5:03:38 AM
> Subject: Re: [developers] Additions to the Simple MRS format
>
> Thanks, Mike.  Regarding LTOP and TOP, I can't think of a scenario where
> both would need to be specified.  As for ICONS, that also looks good,
> though the "target relation clause" terminology is rather
> information-structure specific.  If that terminology is needed anywhere, we
> should probably generalize it so that it applies equally to e.g. coreference
> constraints.
>
>
>
> On Wed, Aug 20, 2014 at 4:12 AM, Michael Wayne Goodman <
> goodmami at u.washington.edu> wrote:
>
>> I forgot to mention the addition of ICONS. It's already been
>> implemented (in ACE, at least), but it would be useful to define it as
>> part of the new format. It looks much like an HCONS list:
>>
>> ICONS: < ... >
>>
>> And the items on the list take the form:
>>
>> target relation clause
>>
>> ... where target and clause are individual variables (i, e, or x), and
>> don't necessarily need to be linked to EPs in the MRS (i.e. it can be
>> an unbound "i" variable for a dropped argument). The set of relations
>> is not fixed, as with HCONS, but defined by the grammar.
>>
>> For example, in Japanese you may have a question 犬が何をした？ "What did the
>> dog do?", with a response 吠えた "Barked." with a dropped subject. The
>> MRS representing the response with the dropped subject as the topic
>> and the verb as the focus might be:
>>
>> [ TOP: h0
>>   INDEX: e2 [ e TENSE: past MOOD: indicative PROG: - PERF: - ASPECT: default_aspect PASS: - SF: prop ]
>>   RELS: < [ "_hoeru_v_1_rel"<0:2> LBL: h1 ARG0: e2 ARG1: i3 ] >
>>   HCONS: < h0 qeq h1 >
>>   ICONS: < i3 topic e2 e2 focus e2 > ]
>>
>> On Mon, Aug 18, 2014 at 1:42 PM, Michael Wayne Goodman
>> <goodmami at u.washington.edu> wrote:
>> > Hello all,
>> >
>> > It was noted in Tomar that the Simple MRS format lacked some
>> > attributes that are present in the XML format, and that these
>> > attributes can be useful for users of MRS. The attributes are:
>> >
>> >   * A Lnk value (e.g. <cfrom:cto>) for the whole MRS
>> >   * "surface" on the top level of the MRS
>> >   * "surface" on the EPs
>> >
>> > Stephan, Ann, Glenn, Woodley, and myself---developers of software that
>> > produces Simple MRS (forgive me if I've left someone out)---have
>> > discussed how to add these to the format, and we have come up with a
>> > way to represent them matching the aesthetics of the original format
>> > and, more importantly, maintaining backwards compatibility (by making
>> > the additions optional and by not outputting them if the data is not
>> > specified).
>> >
>> > We also agreed to make a (long overdue) change so that "LTOP" becomes
>> > "TOP", since in full utterances the thing currently called LTOP is in
>> > fact TOP (i.e. a global top, rather than local; this is further
>> > discussed at the bottom of this email).
>> >
>> > Finally, we agreed to assign a version number to this updated format
>> > (e.g. v1.1, where the currently used format is v1.0), so that
>> > processors can, in theory, input and output MRSs compliant with either
>> > format.
>> >
>> > While the implementation details were discussed off-list, we want to
>> > bring the discussion to developers at delph-in.net (as we agreed to do in
>> > Tomar), so that others have a chance to see and comment on the
>> > proposal.
>> >
>> > Here is an example MRS in the new format:
>> >
>> > [ <0:41> "I am sure I shall say nothing of the kind."
>> >   TOP: h0
>> >   INDEX: e2 [ e SF: prop TENSE: pres MOOD: indicative PROG: - PERF: - ]
>> >   RELS: < [ pron_rel<0:1> "I" LBL: h4 ARG0: x3 [ x PERS: 1 NUM: sg PRONTYPE: std_pron ] ]
>> >           [ pronoun_q_rel<0:1> LBL: h5 ARG0: x3 RSTR: h6 BODY: h7 ]
>> >           [ "_sure_a_of_rel"<5:9> "sure" LBL: h1 ARG0: e2 ARG1: x3 ARG2: h8 ]
>> >           [ pron_rel<15:16> "I" LBL: h9 ARG0: x10 [ x PERS: 1 NUM: sg PRONTYPE: std_pron ] ]
>> >           [ pronoun_q_rel<15:16> LBL: h11 ARG0: x10 RSTR: h12 BODY: h13 ]
>> >           [ "_say_v_1_rel"<23:26> "say" LBL: h14 ARG0: e15 [ e SF: prop TENSE: fut MOOD: indicative PROG: - PERF: - ] ARG1: x10 ARG2: x16 [ x PERS: 3 NUM: sg ] ]
>> >           [ thing_rel<27:34> "nothing" LBL: h17 ARG0: x16 ]
>> >           [ _no_q_rel<27:34> "nothing" LBL: h18 ARG0: x16 RSTR: h19 BODY: h20 ]
>> >           [ _of_p_rel<35:37> "of" LBL: h17 ARG0: e21 [ e SF: prop ] ARG1: x16 ARG2: x22 [ x PERS: 3 NUM: sg IND: + ] ]
>> >           [ _the_q_rel<38:41> "the" LBL: h23 ARG0: x22 RSTR: h24 BODY: h25 ]
>> >           [ "_kind_n_of-n_rel"<42:47> "kind" LBL: h26 ARG0: x22 ARG1: i27 ] >
>> >   HCONS: < h0 qeq h1 h6 qeq h4 h8 qeq h14 h12 qeq h9 h19 qeq h17 h24 qeq h26 > ]
>> >
>> > (I made up the surface values for illustration, so in practice they
>> > may differ, but the formatting will remain the same.)
>> >
>> > We also want to hear how your grammars deal with the TOP variable. In
>> > general, I think, the (actual) LTOP is equated with the top handle of
>> > a local structure, but when a full utterance is produced, the TOP (or
>> > GTOP, in Matrix-derived grammars) is QEQ'd to the handle (i.e. GTOP
>> > qeq LTOP), but this might not be true for all grammars. In particular,
>> > we'd like to know if it's ever necessary to have both TOP *and* LTOP
>> > representable in an MRS.
>> >
>> > Thanks!
>> >
>> > --
>> > -Michael Wayne Goodman
>>
>>
>>
>> --
>> -Michael Wayne Goodman
>>
>>
>
>
> --
> Emily M. Bender
> Associate Professor
> Department of Linguistics
> Check out CLMS on facebook! http://www.facebook.com/uwclma
>

-- 
-Michael Wayne Goodman