[developers] ICONS and generation

Sanghoun Song sanghoun at gmail.com
Sat Feb 6 18:59:09 CET 2016


My apologies for my really late reply!

I am not sure whether I fully understand your discussion, but I would like
to share several of my ideas on using ICONS for generation.

First, in my analysis (final version), only expressions that contribute to
information structure introduce an ICONS element into the list. For
example, the following unmarked sentence (a below) has no ICONS element
(i.e. empty ICONS).

a. plain: Kim chases the dog.
b. passivization: The dog is chased by Kim.
c. fronting: The dog Kim chases.
d. clefting: It is the dog that Kim chases.

Using the type hierarchy for information structure in my thesis, I can say
the following:

(i) The subject Kim and the object the dog in a plain active sentence (a)
are in situ. They may or may not be focused depending on which constituent
bears a specific accent, but in the sentence-based processing their
information structure values had better remain underspecified for flexible
representation.

(ii) The promoted argument the dog in the passive sentence (b) is evaluated
as conveying focus-or-topic, while the demoted argument Kim is associated
with non-topic.

(iii) In (c), the fronted object the dog is assumed to be assigned
focus-or-topic, in that the sentence conveys the meaning of either "As for
the dog, Kim chases it" or the cleft in (d), while the subject in situ is
evaluated as containing neither topic nor focus (i.e. background, bg).
(Background may not be implemented in the ERG, I think.)

(iv) The focused NP in (d) carries focus, and the subject of the cleft
clause, Kim, is also associated with bg.
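
As a minimal sketch of how the values in (i)-(iv) relate to each other, the
following Python fragment encodes a simplified part of the type hierarchy
and checks which values subsume which. The parent links, the use of info-str
for the underspecified in-situ arguments in (a), and the names PARENTS,
VALUES and subsumes are assumptions made for illustration only; the full
hierarchy in the thesis has more types.

```python
# Simplified fragment of an information-structure type hierarchy.
# ASSUMPTION: these parent links are an illustrative reduction, not the
# complete hierarchy from the thesis.
PARENTS = {
    "info-str": [],
    "focus-or-topic": ["info-str"],
    "non-topic": ["info-str"],
    "non-focus": ["info-str"],
    "focus": ["focus-or-topic", "non-topic"],
    "topic": ["focus-or-topic", "non-focus"],
    "bg": ["non-topic", "non-focus"],
}


def subsumes(general, specific):
    """Return True if `general` is `specific` or one of its ancestors."""
    if general == specific:
        return True
    return any(subsumes(general, parent) for parent in PARENTS[specific])


# Values described in (i)-(iv) for the two arguments of examples (a)-(d).
# ASSUMPTION: "info-str" stands in for "left underspecified" in (a).
VALUES = {
    "plain": {"dog": "info-str", "Kim": "info-str"},
    "passivization": {"dog": "focus-or-topic", "Kim": "non-topic"},
    "fronting": {"dog": "focus-or-topic", "Kim": "bg"},
    "clefting": {"dog": "focus", "Kim": "bg"},
}
```

Under these assumptions, subsumes("focus-or-topic", "focus") and
subsumes("non-topic", "bg") both hold, so each step from (a) towards (d)
only narrows the values and never contradicts them.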

Thus, we can create a focus specification hierarchy amongst (a-d) as
[clefting > fronting > passivization > plain].

What I want to say is that a set of sentences which share some properties
may differ in subtle shades of meaning depending on how focus is assigned
to them. Paraphrasing should go only from right to left in
[clefting > fronting > passivization > plain], because paraphrasing in the
opposite direction necessarily causes loss of information. For example, a
plain sentence such as (a) can be paraphrased into a cleft construction
such as (d), but not vice versa.

In a nutshell, it is better not to paraphrase a more specific sentence
into a less specific one in terms of information structure.
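
The direction constraint can be sketched in a few lines of Python. This is
only an illustration of the ordering [clefting > fronting > passivization >
plain]; the names SPECIFICITY and can_paraphrase are hypothetical and do not
belong to any existing generator.

```python
# Constructions ordered from least to most specific with respect to
# information structure, following
# [clefting > fronting > passivization > plain].
SPECIFICITY = ["plain", "passivization", "fronting", "clefting"]


def can_paraphrase(source, target):
    """Allow a paraphrase only if it does not lose information, i.e. the
    target is at least as specific as the source."""
    return SPECIFICITY.index(source) <= SPECIFICITY.index(target)
```

For example, can_paraphrase("plain", "clefting") is True, while
can_paraphrase("clefting", "plain") is False.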

Second, I provided many dependency graphs in my thesis. The main reason was
that nobody outside of DELPH-IN can fully understand the complex
co-indexation in ICONS/MRS. At that time, I didn't work on DMRS with
respect to ICONS. If there is a way to represent ICONS in DMRS (directly
from the TFS or via MRS), I am interested in the formalism.


Sanghoun


On Sat, Feb 6, 2016 at 1:26 AM, Ann Copestake <aac10 at cam.ac.uk> wrote:

> Briefly (more this evening maybe) - I don't see a particular problem with
> filling in the ICONS since what you describe are relationships that are
> overt in the *MRS anyway, aren't they?  I thought, in fact, that these are
> pretty clear from the DMRS graph - which is why Sanghoun uses it to
> describe what's going on.
>
> I believe we can build the DMRS graph direct from the TFS, incidentally -
> don't need to go via MRS ...
>
> Cheers,
>
> Ann
>
>
> On 05/02/2016 23:40, Dan Flickinger wrote:
>
> As I understand the Song and Bender account, an MRS for a sentence should
> include in the ICONS list at least one element for each individual
> (eventuality or instance) that is introduced.  In the ERG this would mean
> that the value of each ARG0 should appear in at least one ICONS entry,
> where most of these would be of the maximally underspecified type
> `info-str', but possibly specialized because of syntactic structure or
> stress/accent or maybe even discourse structure.
>
>
> I see the virtue of having these overt ICONS elements even when of type
> `info-str', to enable the fine-grained control that Stephan notes that we
> want for generation, and also to minimize the differences between the ERG
> and grammars being built from the Matrix which embody Sanghoun's careful
> work.
>
>
> If the grammarian is to get away with not explicitly introducing each of
> these ICONS elements in the lexical entries, as Sanghoun does in the
> Matrix, then it would have to be possible to predict and perhaps
> mechanically add the missing ones after composition was completed.  I used
> to hope that this would be possible, but now I'm doubtful, leading me to
> think that there is no good alternative to the complication (maybe I should
> more kindly use the term `enrichment') of the grammar with the overt
> introduction of these guys everywhere.  Here's my reasoning:
>
>
> I assume that what we'll want in an MRS for an ordinary sentence is an
> ICONS list that has exactly one entry for each pair of an individual `i'
> and the eventuality which is the ARG0 of each predication in which `i'
> appears as an argument.  Thus for `the cat persuaded the dog to bark' the
> ICONS list should have four elements: one for cat/persuade, one for
> dog/persuade, one for bark/persuade, and one for dog/bark.  Now if I wanted
> to have the grammar continue to only insert ICONS elements during
> composition for the non-vanilla info-str phenomena, and fill in the rest
> afterward, I would have to know not only the arity of each
> eventuality-predication, but which of its arguments was realized in the
> sentence, and even worse, which of the realized syntactic arguments
> corresponded to semantic arguments (so for example not the direct object of
> `believe').  Maybe I give up too soon here, but this does not seem doable
> just operating on the MRS resulting from composition, even with access to
> the SEM-I.
>
>
> So if the necessary ICONS elements have to be introduced overtly by the
> lexicon/grammar during composition, then I would still like to explore a
> middle ground that does not result in the full set of ICONS elements Song
> and Bender propose for a sentence.  That is, I wondered whether we could
> make do with adding to the ERG the necessary introduction of just those
> ICONS elements that would enable us to draw the distinctions between
> `unmarked', 'topic', and 'focus' that we were used to exploiting in the
> days of messages.   But since pretty much any preposition's or adjective's
> or verb's complement can be extracted, and any verb's subject can be
> extracted, and most verbs' direct and indirect objects can be passivized, I
> think we'll still end up with an ICONS entry for each eventuality/argument
> pair for every predication-introducing verb, adjective, and preposition in
> a sentence, and maybe also for some nouns as in "who is that picture of?".
> This still lets us exclude ICONS elements involving adverbs and maybe also
> the arguments of conjunctions, subordinators, modals.  If we went this
> route, I think it would be possible to make modest additions to certain of
> the constructions, and not have to meddle with lexical types, to get these
> ICONS elements into the MRS during composition.
>
>
> Such a partial approach does not have the purity of Song and Bender's
> account, but might be more practical, at least as a first step, for the
> ERG.  It would at least enable what I think is a more consistent
> interpretation of the ICONS elements for generation, and should give us the
> fine-grained control I agree that we want.  Thus to get the generator to
> produce all variants from an MRS produced by parsing a simple declarative,
> one would have to remove the info-str ICONS element whose presence excludes
> the specialization to focus or topic because of our friend Skolem.
>
>
> Counsel?
>
>
>  Dan
>
> ------------------------------
> *From:* developers-bounces at emmtee.net on behalf of Ann Copestake
> <aac10 at cam.ac.uk>
> *Sent:* Friday, February 5, 2016 1:43 PM
> *To:* Emily M. Bender; Stephan Oepen
> *Cc:* developers; Ann Copestake
> *Subject:* Re: [developers] ICONS and generation
>
> Thanks!
>
> On 05/02/2016 21:30, Emily M. Bender wrote:
>
> Not sure if this answers the question, but a couple of comments:
>
> (a) I do think that written English is largely underspecified for
> information structure.  It's part of what makes good writing good that the
> information structure is made apparent somehow.
>
>
> OK.  should I understand you as saying that composition (as in, what we do
> in the grammars) leaves it mostly underspecified, but that discourse level
> factors make it apparent?  or that it really is underspecified?
>
> (b) I think the "I want only the unmarked form back" case might be handled
> by either a setting which says "no ICONS beyond what is in the input" (i.e.
> your ICONS { }) or a pre-processing/generation fix-up rule that takes
> ICONS { ... } and outputs something that would be incompatible with anything
> but the unmarked form.  Or maybe the subsumption check goes the wrong way
> for this one?
>
> Yes, I think the ICONS {} might be a possible way of thinking about it.  I
> should make it clear - I don't think there's a problem with constructing an
> implementation that produces the `right' behaviour but I would much prefer
> that the behaviour is specifiable cleanly in the formalism rather than as
> another parameter to the generator or whatever.
>
> I hope Sanghoun has something to add here!
>
> Emily
>
> On Fri, Feb 5, 2016 at 1:01 PM, Stephan Oepen <oe at ifi.uio.no> wrote:
>
>> colleagues,
>>
>> my ideal would be a set-up where the provider of generator inputs has
>> three options: (a) request topicalization (or similar), (b) disallow it, or
>> (c) underspecify and get both variants.
>>
>> we used to have that level of control (and flexibility) in the LOGON days
>> where there were still messages: in the message EPs, there were two
>> optional ‘pseudo’ roles (TPC and PSV) to control topicalization or
>> passivization of a specific instance variable.  effectively, when
>> present, these established a binary relation between the clause and one of
>> its nominal constituents.  if i recall correctly, blocking topicalization
>> was accomplished by putting an otherwise unbound ‘anti’-variable into the
>> TPC or PSV roles.
>>
>> could one imagine something similar in the ICONS realm, and if so, which
>> form would it have to take?
>>
>> best wishes, oe
>>
>>
>> On Friday, February 5, 2016, Woodley Packard <sweaglesw at sweaglesw.org>
>> wrote:
>>
>>> I can confirm that under ACE, behavior is what you indicate, i.e.
>>> generating from parsing the topicalized feline-canine-playtime I get just
>>> the topicalized variant out, but when generating from parsing the ordinary
>>> word order I get all 5 variants out.
>>>
>>> I believe this was designed to imitate the long-standing condition that
>>> the MRS of generation results must be subsumed by the input MRS.  The
>>> observed behavior seems to me to be the correct interpretation of the
>>> subsumption relation with ICONS involved.  Note that an MRS with an extra
>>> intersective modifier would also be subsumed, for example, but such MRS are
>>> never actually generated since those modifier lexical entries never make it
>>> into the chart.
>>>
>>> It’s certainly reasonable to ask whether (this notion of) subsumption is
>>> really the right test.  I’ve met lots of folks who prefer to turn that
>>> subsumption test off entirely.  I guess it’s also possible that the
>>> subsumption test is right for the RELS portion of the MRS but not for the
>>> ICONS, though that seems a bit odd to consider.  However, given that we
>>> don’t have many ideas about truth-conditional implications of ICONS, maybe
>>> not so odd.
>>>
>>> I don’t really have much to offer in terms of opinions about what the
>>> right behavior should be.  I (believe I) just implemented what others asked
>>> for a couple years ago :-)
>>>
>>> -Woodley
>>>
>>> > On Feb 5, 2016, at 8:03 AM, Ann Copestake <aac10 at cl.cam.ac.uk> wrote:
>>> >
>>> > I'm part way through getting ICONS support working in Lisp, testing on
>>> > the version of the ERG available as trunk. I have a question about
>>> > generation.  If I implemented the behaviour described in
>>> > http://moin.delph-in.net/IconsSpecs there doesn't seem to be a way of
>>> > specifying that I want a `normal' ordering for English.
>>> >
>>> > e.g., if I take the MRS resulting from
>>> >
>>> > that dog, the cat chased.
>>> >
>>> > without ICONS check, there are 5 realizations, including the `null
>>> > ICONS' case `The cat chased that dog.'  With an exact ICONS check, I can
>>> > select realizations with the same ICONS (modulo order of ICONS elements,
>>> > of course, in the case where there's more than one element).  But with
>>> > the http://moin.delph-in.net/IconsSpecs behaviour, there's no way of
>>> > specifying I want a `normal' order - if I don't give an ICONS, I will
>>> > always get the 5 realisations. In fact, as I understand it, I can always
>>> > end up with more icons in the realisation than in the input, as long as
>>> > I can match the ones in the input.
>>> >
>>> > So:
>>> > - is the IConsSpec behaviour what is desired for the ERG (e.g.,
>>> > because one can rely on the realisation ranking to prefer the most
>>> > `normal' order)?
>>> > - or does the ERG behave differently from Emily and Sanghoun's
>>> > grammars, such that different generator behaviour is desirable? and if
>>> > so, could we change things so we don't need different behaviours
>>> >
>>> > Ann
>>> >
>>> >
>>> >
>>>
>>>
>>>
>
>
> --
> Emily M. Bender
> Professor, Department of Linguistics
> Check out CLMS on facebook! <http://www.facebook.com/uwclma>
>
>
>
>


-- 
=================================
Sanghoun Song
Assistant Professor
Dept. of English Language and Literature
Incheon National University
http://corpus.mireene.com
phone: +82-32-835-8129 (office)
=================================