[developers] ICONS and generation

Dan Flickinger danf at stanford.edu
Sat Feb 6 00:40:46 CET 2016


As I understand the Song and Bender account, an MRS for a sentence should include in the ICONS list at least one element for each individual (eventuality or instance) that is introduced.  In the ERG this would mean that the value of each ARG0 should appear in at least one ICONS entry, where most of these would be of the maximally underspecified type `info-str', but possibly specialized because of syntactic structure, stress/accent, or maybe even discourse structure.


I see the virtue of having these overt ICONS elements even when of type `info-str', to enable the fine-grained control that Stephan notes we want for generation, and also to minimize the differences between the ERG and grammars built from the Matrix, which embody Sanghoun's careful work.


If the grammarian is to get away with not explicitly introducing each of these ICONS elements in the lexical entries (as Sanghoun does in the Matrix), then it would have to be possible to predict, and perhaps mechanically add, the missing ones once composition is complete.  I used to hope that this would be possible, but now I'm doubtful, leading me to think that there is no good alternative to the complication (maybe I should more kindly use the term `enrichment') of the grammar with the overt introduction of these guys everywhere.  Here's my reasoning:


I assume that what we'll want in an MRS for an ordinary sentence is an ICONS list that has exactly one entry for each pair of an individual `i' and the eventuality which is the ARG0 of each predication in which `i' appears as an argument.  Thus for `the cat persuaded the dog to bark' the ICONS list should have four elements: one for cat/persuade, one for dog/persuade, one for bark/persuade, and one for dog/bark.  Now if I wanted to have the grammar continue to insert ICONS elements during composition only for the non-vanilla info-str phenomena, and fill in the rest afterward, I would have to know not only the arity of each eventuality-predication, but also which of its arguments were realized in the sentence, and even worse, which of the realized syntactic arguments corresponded to semantic arguments (so, for example, not the direct object of `believe').  Maybe I give up too soon here, but this does not seem doable just operating on the MRS resulting from composition, even with access to the SEM-I.
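As a rough, purely illustrative sketch of that enumeration (plain Python over toy EPs, not actual ERG output; persuade's clausal argument is flattened here to the bark eventuality, where the real grammar would mediate it with a handle):

    # Sketch only: toy EPs for `the cat persuaded the dog to bark',
    # quantifiers omitted, roles simplified to ARGn -> variable.
    eps = [
        {"pred": "_cat_n_1",       "ARG0": "x1"},
        {"pred": "_dog_n_1",       "ARG0": "x2"},
        {"pred": "_persuade_v_of", "ARG0": "e3",
         "ARG1": "x1", "ARG2": "x2", "ARG3": "e4"},
        {"pred": "_bark_v_1",      "ARG0": "e4", "ARG1": "x2"},
    ]

    def expected_icons_pairs(eps):
        """One (argument, clause) pair per individual/eventuality pairing."""
        pairs = []
        for ep in eps:
            clause = ep["ARG0"]
            for role, value in ep.items():
                if role not in ("pred", "ARG0"):
                    pairs.append((value, clause))
        return pairs

    print(expected_icons_pairs(eps))
    # -> [('x1', 'e3'), ('x2', 'e3'), ('e4', 'e3'), ('x2', 'e4')]
    # i.e. cat/persuade, dog/persuade, bark/persuade, dog/bark.
    # The catch: the MRS alone does not say which of these argument positions
    # was actually realized in the sentence, nor that a realized direct object
    # (as with `believe') may not be a semantic argument at all, so a post-hoc
    # filler working only from the MRS both over- and under-generates.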


So if the necessary ICONS elements have to be introduced overtly by the lexicon/grammar during composition, then I would still like to explore a middle ground that does not result in the full set of ICONS elements Song and Bender propose for a sentence.  That is, I wondered whether we could make do with adding to the ERG the introduction of just those ICONS elements that would enable us to draw the distinctions between `unmarked', `topic', and `focus' that we were used to exploiting in the days of messages.  But since pretty much any preposition's or adjective's or verb's complement can be extracted, any verb's subject can be extracted, and most verbs' direct and indirect objects can be passivized, I think we'll still end up with an ICONS entry for each eventuality/argument pair for every predication-introducing verb, adjective, and preposition in a sentence, and maybe also for some nouns, as in "who is that picture of?".  This still lets us exclude ICONS elements involving adverbs, and maybe also the arguments of conjunctions, subordinators, and modals.  If we went this route, I think it would be possible to make modest additions to certain of the constructions, and not have to meddle with lexical types, to get these ICONS elements into the MRS during composition.
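A correspondingly rough sketch of that middle ground (same toy EP dicts as above; note that reading the part of speech off the predicate name is only a stand-in for consulting lexical types or the SEM-I, and would not, for instance, cleanly separate adverbs from adjectives, nor catch relational nouns like `picture of'):

    # Sketch: keep ICONS pairs only for arguments of verbs, adjectives and
    # prepositions (the extractable/passivizable positions), skipping
    # quantifiers, conjunctions, subordinators, modals, etc.
    EXTRACTION_HOSTS = {"v", "a", "p"}

    def pos_of(pred):
        # "_chase_v_1" -> "v"; a real grammar would consult lexical types
        # or the SEM-I rather than parse the predicate name.
        parts = pred.strip("_").split("_")
        return parts[1] if len(parts) > 1 else "?"

    def middle_ground_pairs(eps):
        pairs = []
        for ep in eps:
            if pos_of(ep["pred"]) not in EXTRACTION_HOSTS:
                continue
            for role, value in ep.items():
                if role not in ("pred", "ARG0"):
                    pairs.append((value, ep["ARG0"]))
        return pairs

    eps = [
        {"pred": "_the_q",      "ARG0": "x3"},
        {"pred": "_cat_n_1",    "ARG0": "x3"},
        {"pred": "_chase_v_1",  "ARG0": "e2", "ARG1": "x3", "ARG2": "x4"},
        {"pred": "_that_q_dem", "ARG0": "x4"},
        {"pred": "_dog_n_1",    "ARG0": "x4"},
    ]
    print(middle_ground_pairs(eps))   # -> [('x3', 'e2'), ('x4', 'e2')]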


Such a partial approach does not have the purity of Song and Bender's account, but might be more practical, at least as a first step, for the ERG.  It would at least enable what I think is a more consistent interpretation of the ICONS elements for generation, and should give us the fine-grained control I agree that we want.  Thus to get the generator to produce all variants from an MRS produced by parsing a simple declarative, one would have to remove the info-str ICONS element whose presence excludes the specialization to focus or topic, because of our friend Skolem.
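As a sketch of the kind of fix-up that would entail (toy (target, clause, type) triples, with `info-str' standing in for the maximally underspecified type):

    # Sketch: before handing a parsed MRS to the generator, drop ICONS
    # elements that carry only the maximally underspecified type, so that
    # Skolemization does not freeze them and thereby block the topic/focus
    # specializations in the realizations.
    UNDERSPECIFIED = "info-str"

    def relax_icons(icons):
        return [(target, clause, itype) for (target, clause, itype) in icons
                if itype != UNDERSPECIFIED]

    parsed_icons = [("x5", "e2", "info-str"), ("x9", "e2", "info-str")]
    print(relax_icons(parsed_icons))   # -> []  (all word-order variants licensed)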


Counsel?


 Dan

________________________________
From: developers-bounces at emmtee.net <developers-bounces at emmtee.net> on behalf of Ann Copestake <aac10 at cam.ac.uk>
Sent: Friday, February 5, 2016 1:43 PM
To: Emily M. Bender; Stephan Oepen
Cc: developers; Ann Copestake
Subject: Re: [developers] ICONS and generation

Thanks!

On 05/02/2016 21:30, Emily M. Bender wrote:
Not sure if this answers the question, but a couple of comments:

(a) I do think that written English is largely underspecified for information structure.
It's part of what makes good writing good that the information structure is made apparent
somehow.


OK.  Should I understand you as saying that composition (as in, what we do in the grammars) leaves it mostly underspecified, but that discourse-level factors make it apparent?  Or that it really is underspecified?

(b) I think the "I want only the unmarked form back" case might be handled either by a setting which says "no ICONS beyond what is in the input" (i.e. your ICONS { }) or by a pre-processing/generation fix-up rule that takes ICONS { ... } and outputs something that would be incompatible with anything but the unmarked form.  Or maybe the subsumption check goes the wrong way for this one?
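One way to picture option (a) is as a post-filter over realizations; a minimal sketch, again over toy (target, clause, type) triples, and effectively the exact-ICONS check Ann describes in her original message below:

    # Sketch: keep only realizations whose ICONS list equals the input's,
    # modulo order -- i.e. "no ICONS beyond what is in the input".
    from collections import Counter

    def filter_exact_icons(input_icons, realizations):
        # realizations: list of (surface string, icons list) pairs
        want = Counter(input_icons)
        return [s for (s, icons) in realizations if Counter(icons) == want]

    realizations = [
        ("The cat chased that dog.",  []),
        ("That dog, the cat chased.", [("x8", "e2", "topic")]),
    ]
    print(filter_exact_icons([], realizations))
    # -> ['The cat chased that dog.']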

Yes, I think the ICONS {} might be a possible way of thinking about it.  I should make it clear: I don't think there's a problem with constructing an implementation that produces the `right' behaviour, but I would much prefer that the behaviour be specifiable cleanly in the formalism rather than as another parameter to the generator or whatever.

I hope Sanghoun has something to add here!

Emily

On Fri, Feb 5, 2016 at 1:01 PM, Stephan Oepen <oe at ifi.uio.no<mailto:oe at ifi.uio.no>> wrote:
colleagues,

my ideal would be a set-up where the provider of generator inputs has three options: (a) request topicalization (or similar), (b) disallow it, or (c) underspecify and get both variants.

we used to have that level of control (and flexibility) in the LOGON days, when there were still messages: in the message EPs, there were two optional ‘pseudo’ roles (TPC and PSV) to control topicalization or passivization of a specific instance variable.  effectively, when present, these established a binary relation between the clause and one of its nominal constituents.  if i recall correctly, blocking topicalization was accomplished by putting an otherwise unbound ‘anti’-variable into the TPC or PSV roles.

could one imagine something similar in the ICONS realm, and if so, which form would it have to take?
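one hypothetical form, as a sketch only (plain Python over toy (target, clause, type) triples; the type names, in particular `non-topic' as a stand-in for the old unbound ‘anti’-variable, are assumptions rather than anything the ERG or the Matrix currently fixes):

    # sketch: three control modes for a given instance variable, mirroring
    # the old TPC/PSV roles; the type names are hypothetical.
    def topicalization_constraint(mode, target, clause):
        if mode == "request":       # (a) ask for the topicalized variant
            return [(target, clause, "topic")]
        if mode == "disallow":      # (b) analogue of the unbound ‘anti’-variable
            return [(target, clause, "non-topic")]
        return []                   # (c) underspecify: both variants come back

    for mode in ("request", "disallow", "underspecify"):
        print(mode, topicalization_constraint(mode, "x8", "e2"))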

best wishes, oe


On Friday, February 5, 2016, Woodley Packard <sweaglesw at sweaglesw.org<mailto:sweaglesw at sweaglesw.org>> wrote:
I can confirm that under ACE the behavior is what you indicate, i.e. generating from the parse of the topicalized feline-canine-playtime example I get just the topicalized variant out, but generating from the parse of the ordinary word order I get all 5 variants out.

I believe this was designed to imitate the long-standing condition that the MRS of generation results must be subsumed by the input MRS.  The observed behavior seems to me to be the correct interpretation of the subsumption relation with ICONS involved.  Note that an MRS with an extra intersective modifier would also be subsumed, for example, but such MRSs are never actually generated, since those modifier lexical entries never make it into the chart.
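As a sketch of that interpretation (toy (target, clause, type) triples, a toy type hierarchy in place of the real info-str types, and variable names assumed to be aligned already by the RELS match): every input ICONS element must be matched by a compatible element in the output, and the output is free to add elements beyond that.

    # Sketch of the subsumption-style ICONS check.
    PARENT = {"topic": "info-str", "focus": "info-str", "info-str": None}

    def compatible(out_type, in_type):
        # out_type must be equal to, or a descendant of, in_type
        t = out_type
        while t is not None:
            if t == in_type:
                return True
            t = PARENT.get(t)
        return False

    def icons_subsumed(input_icons, output_icons):
        for (target, clause, in_type) in input_icons:
            if not any(target == t and clause == c and compatible(o, in_type)
                       for (t, c, o) in output_icons):
                return False
        return True   # extra elements in the output are fine

    # Parsing the topicalized sentence yields a topic element: only the
    # topicalized realization passes.
    print(icons_subsumed([("x8", "e2", "topic")], [("x8", "e2", "topic")]))  # True
    print(icons_subsumed([("x8", "e2", "topic")], []))                       # False
    # Parsing the ordinary order yields no ICONS: all 5 realizations pass.
    print(icons_subsumed([], [("x8", "e2", "topic")]))                       # True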

It’s certainly reasonable to ask whether (this notion of) subsumption is really the right test.  I’ve met lots of folks who prefer to turn that subsumption test off entirely.  I guess it’s also possible that the subsumption test is right for the RELS portion of the MRS but not for the ICONS, though that seems a bit odd to consider.  Then again, given that we don’t have many ideas about the truth-conditional implications of ICONS, maybe it’s not so odd.

I don’t really have much to offer in terms of opinions about what the right behavior should be.  I (believe I) just implemented what others asked for a couple years ago :-)

-Woodley

> On Feb 5, 2016, at 8:03 AM, Ann Copestake <aac10 at cl.cam.ac.uk> wrote:
>
> I'm part way through getting ICONS support working in Lisp, testing on the version of the ERG available as trunk.  I have a question about generation.  If I implement the behaviour described in http://moin.delph-in.net/IconsSpecs, there doesn't seem to be a way of specifying that I want a `normal' ordering for English.
>
> e.g., if I take the MRS resulting from
>
> that dog, the cat chased.
>
> Without an ICONS check, there are 5 realisations, including the `null ICONS' case `The cat chased that dog.'  With an exact ICONS check, I can select realisations with the same ICONS (modulo order of ICONS elements, of course, in the case where there's more than one element).  But with the http://moin.delph-in.net/IconsSpecs behaviour, there's no way of specifying that I want a `normal' order - if I don't give an ICONS, I will always get the 5 realisations.  In fact, as I understand it, I can always end up with more ICONS in the realisation than in the input, as long as I can match the ones in the input.
>
> So:
> - is the IconsSpecs behaviour what is desired for the ERG (e.g., because one can rely on the realisation ranking to prefer the most `normal' order)?
> - or does the ERG behave differently from Emily and Sanghoun's grammars, such that different generator behaviour is desirable?  And if so, could we change things so we don't need different behaviours?
>
> Ann
>
>
>





--
Emily M. Bender
Professor, Department of Linguistics
Check out CLMS on facebook! http://www.facebook.com/uwclma


