[developers] ICONS and generation

Ann Copestake aac10 at cl.cam.ac.uk
Thu Feb 18 19:23:05 CET 2016


Thanks - would anyone else like to comment?  I'm currently not sure 
whether simply enriching ICONS (and possibly HCONS) with a ... notation 
would be enough to give us what's needed in terms of behaviour but I 
suspect that needs to be part of the solution.

Ann

On 18/02/16 00:26, Emily M. Bender wrote:
> Just a quick and belated reply to say that from where I sit your 
> analysis
> of the situation makes a lot of sense.
>
> Emily
>
> On Sun, Feb 7, 2016 at 2:12 AM, Ann Copestake <aac10 at cam.ac.uk 
> <mailto:aac10 at cam.ac.uk>> wrote:
>
>     Thanks! and thanks all!  I've come to a view on this which I think
>     is consistent with what everyone has been saying.
>
>     First of all, note that in the MRS syntax, we do not distinguish
>     between terminated and non-terminated lists/bags.  If we think
>     about it from the perspective of typed feature structures, it is
>     clear that there is a distinction - for instance a type `list' is
>     the most general type of list, and the type `e-list' (empty list)
>     is usually a maximally specific type.    Coming back to the
>     notation I used in an earlier message, there is a distinction
>     between { ... } (analogous to list in a TFS) and {} (cf e-list).
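The terminated/non-terminated distinction above can be sketched in a few lines of code. This is my own toy encoding for illustration (the names `IconsBag`, `closed`, and `may_extend` are invented, not DELPH-IN syntax): an open bag `{ ... }` may still acquire elements, while a closed bag `{}` guarantees there are no more.

```python
from dataclasses import dataclass, field

@dataclass
class IconsBag:
    """Toy model of an ICONS value: a bag of elements plus a termination flag."""
    elements: list = field(default_factory=list)
    closed: bool = False  # True ~ {} (cf. e-list); False ~ { ... } (cf. list)

def may_extend(bag: IconsBag) -> bool:
    """An open bag can be enriched (e.g. by discourse processing); a closed one cannot."""
    return not bag.closed

open_empty = IconsBag(closed=False)    # { ... }: underspecified, may grow
closed_empty = IconsBag(closed=True)   # {}: terminated, guaranteed empty

assert may_extend(open_empty)
assert not may_extend(closed_empty)
```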
>
>     Now, there are two possible interpretations of ICONS as it arises
>     from a DELPH-IN grammar (i.e., as it is output after parsing):
>     1. information structure
>     2. information structure as it arises from morphosyntax
>
>     In the `normal' sentences of the `Kim chased the dog' type, no
>     information structure elements arise from morphosyntax.  We can,
>     however, expect that various contexts (e.g., discourse) give rise
>     to information structure in association with such a sentence. 
>     Hence, with respect to interpretation 1, ICONS is not strictly
>     empty but underspecified (and similarly a one element ICONS may be
>     underspecified with respect to a two-element ICONS and so on).  I
>     think this is consistent with what Sanghoun and Emily are saying. 
>     Under this interpretation, an MRS with no specification of ICONS
>     should indeed generate all the variant sentences we've been
>     discussing.  And so on.
>
>     However, with respect to interpretation 2, the ICONS emerging from
>     the parse of such a sentence is terminated. Once we've finished
>     parsing, we're guaranteeing no more ICONS elements will arise from
>     morphosyntax, whatever someone does in discourse.  Under this
>     interpretation, if I say I want to generate a sentence with an
>     empty ICONS, I mean I want to generate a sentence with no ICONS
>     contribution from morphosyntax.  This is also a legitimate use of
>     the realiser, considered as a stand-alone module.
>
>     Since ICONS is something which I have always thought of as on the
>     boundary of morphosyntax and discourse, I want to be able to
>     enrich ICONS emerging from parsing with discourse processing, so
>     interpretation 1 makes complete sense.  However, I believe it is
>     also perfectly legitimate to be able to divide the world into what
>     the grammar can be expected to do and what it can't, and that is
>     consistent with interpretation 2.
>
>     As a hypothetical move, consider an additional classification of
>     ICONS elements according to whether or not they arise from
>     morphosyntax.  Then we can see that a single ICONS value could
>     encompass both interpretations: what would arise from a parse
>     would be a terminated list of morphosyntactic ICONS elements,
>     but the ICONS as a whole could be non-terminated.
>
>     I think there may be reasons to be able to distinguish ICONS
>     elements according to whether they are intended as grammar-derived
>     or not, though I do see this might look messy.  But anyway, I want
>     to first check that everyone agrees with this analysis of the
>     situation before trying to work out what we might do about it in
>     terms of notation.
>
>     Incidentally - re Dan's message - my overly brief comment about
>     Sanghoun's use of DMRS earlier was intended to point out that if
>     DMRS had the necessary links for demonstrating ICONS, then in
>     principle this was something we know how to extract.  But right
>     now, I'm not clear whether or not we do need all the
>     underspecified elements, and that's something I would like Dan to
>     comment on before we go further.
>
>     All best,
>
>     Ann
>
>
>
>     On 06/02/2016 17:59, Sanghoun Song wrote:
>>     My apologies for my really late reply!
>>
>>     I am not sure whether I fully understand your discussion, but I
>>     would like to offer several of my ideas on using ICONS for generation.
>>
>>     First, in my analysis (final version), only expressions that
>>     contribute to information structure introduce an ICONS element
>>     into the list. For example, the following unmarked sentence (a
>>     below) has no ICONS element (i.e. empty ICONS).
>>
>>     a. plain: Kim chases the dog.
>>     b. passivization: The dog is chased by Kim.
>>     c. fronting: The dog Kim chases.
>>     d. clefting: It is the dog that Kim chases.
>>
>>     Using the type hierarchy for information structure in my thesis,
>>     I can say the following:
>>
>>     (i) The subject Kim and the object the dog in a plain active
>>     sentence (a) are in situ. They may or may not be focused
>>     depending on which constituent bears a specific accent, but in
>>     sentence-based processing their information structure values are
>>     best left underspecified, for flexibility of representation.
>>
>>     (ii) The promoted argument the dog in the passive sentence (b) is
>>     evaluated as conveying focus-or-topic, while the demoted argument
>>     Kim is associated with non-topic.
>>
>>     (iii) In (c), the fronted object the dog is assumed to be
>>     assigned focus-or-topic, in that the sentence conveys a meaning
>>     of either "As for the dog, Kim chases it" or (d), while the
>>     subject in situ is evaluated as containing neither topic nor
>>     focus (i.e. background). (Background may not be implemented in
>>     the ERG, I think.)
>>
>>     (iv) The focused NP in (d) carries focus, and the subject in the
>>     cleft clause Kim is also associated with bg.
>>
>>     Thus, we can create a focus specification hierarchy amongst (a-d)
>>     as [clefting > fronting > passivization > plain].
>>
>>     What I want to say is that a set of sentences which share some
>>     properties may have subtle shades of meaning depending on how
>>     focus is assigned to the sentences. Paraphrasing is made only in
>>     the direction from the right to the left of [clefting > fronting
>>     > passivization > plain], because paraphrasing in the opposite
>>     direction necessarily causes loss of information. For example, a
>>     plain sentence such as (a) can be paraphrased into a cleft
>>     construction such as (d), but not vice versa.
>>
>>     In a nutshell, a more specific sentence should not be
>>     paraphrased into a less specific sentence in terms of
>>     information structure.
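Sanghoun's one-way paraphrase constraint can be sketched as a simple ordering check. This is a toy illustration (the list and function names are mine, not from the thesis): paraphrasing may move from a less focus-specific form to a more specific one, but never the reverse, since that would lose information.

```python
# Least to most specific with respect to focus marking,
# following the hierarchy [clefting > fronting > passivization > plain].
SPECIFICITY = ["plain", "passivization", "fronting", "clefting"]

def can_paraphrase(source: str, target: str) -> bool:
    """Paraphrasing may add information-structure marking but never drop it."""
    return SPECIFICITY.index(source) <= SPECIFICITY.index(target)

# A plain sentence (a) can be paraphrased as a cleft (d), but not vice versa.
assert can_paraphrase("plain", "clefting")
assert not can_paraphrase("clefting", "plain")
```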
>>
>>     Second, I provided many dependency graphs in my thesis. The main
>>     reason was that nobody outside of DELPH-IN can fully
>>     understand the complex co-indexation in ICONS/MRS. At that time,
>>     I didn't work on DMRS with respect to ICONS. If there is a way to
>>     represent ICONS in DMRS (direct from TFS or via MRS), I am
>>     interested in the formalism.
>>
>>
>>     Sanghoun
>>
>>
>>     On Sat, Feb 6, 2016 at 1:26 AM, Ann Copestake <aac10 at cam.ac.uk
>>     <mailto:aac10 at cam.ac.uk>> wrote:
>>
>>         Briefly (more this evening maybe) - I don't see a particular
>>         problem with filling in the ICONS since what you describe are
>>         relationships that are overt in the *MRS anyway, aren't
>>         they?  I thought, in fact, that these are pretty clear from
>>         the DMRS graph - which is why Sanghoun uses it to describe
>>         what's going on.
>>
>>         I believe we can build the DMRS graph direct from the TFS,
>>         incidentally - don't need to go via MRS ...
>>
>>         Cheers,
>>
>>         Ann
>>
>>
>>         On 05/02/2016 23:40, Dan Flickinger wrote:
>>>
>>>         As I understand the Song and Bender account, an MRS for a
>>>         sentence should include in the ICONS list at least one
>>>         element for each individual (eventuality or instance) that
>>>         is introduced. In the ERG this would mean that the value of
>>>         each ARG0 should appear in at least one ICONS entry, where
>>>         most of these would be of the maximally underspecified type
>>>         `info-str', but possibly specialized because of syntactic
>>>         structure or stress/accent or maybe even discourse structure.
>>>
>>>
>>>         I see the virtue of having these overt ICONS elements even
>>>         when of type `info-str', to enable the fine-grained control
>>>         that Stephan notes that we want for generation, and also to
>>>         minimize the differences between the ERG and grammars being
>>>         built from the Matrix which embody Sanghoun's careful work.
>>>
>>>
>>>         If the grammarian is to get away with not explicitly
>>>         introducing each of these ICONS elements in the lexical
>>>         entries, as Sanghoun does in the Matrix, then it would have
>>>         to be possible to predict and perhaps mechanically add the
>>>         missing ones after composition was completed.  I used to
>>>         hope that this would be possible, but now I'm doubtful,
>>>         leading me to think that there is no good alternative to the
>>>         complication (maybe I should more kindly use the term
>>>         `enrichment') of the grammar with the overt introduction of
>>>         these guys everywhere.  Here's my reasoning:
>>>
>>>
>>>         I assume that what we'll want in an MRS for an ordinary
>>>         sentence is an ICONS list that has exactly one entry for
>>>         each pair of an individual `i' and the eventuality which is
>>>         the ARG0 of each predication in which `i' appears as an
>>>         argument.  Thus for `the cat persuaded the dog to bark' the
>>>         ICONS list should have four elements: one for cat/persuade,
>>>         one for dog/persuade, one for bark/persuade, and one for
>>>         dog/bark.  Now if I wanted to have the grammar continue to
>>>         only insert ICONS elements during composition for the
>>>         non-vanilla info-str phenomena, and fill in the rest
>>>         afterward, I would have to know not only the arity of each
>>>         eventuality-predication, but which of its arguments was
>>>         realized in the sentence, and even worse, which of the
>>>         realized syntactic arguments corresponded to semantic
>>>         arguments (so for example not the direct object of
>>>         `believe'). Maybe I give up too soon here, but this does not
>>>         seem doable just operating on the MRS resulting from
>>>         composition, even with access to the SEM-I.
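Dan's "exactly one ICONS entry per individual/eventuality pair" criterion can be made concrete with a small script. This is a toy sketch over hand-built predications, not ERG code (the predicate names and the `icons_pairs` helper are invented): for `the cat persuaded the dog to bark` it enumerates the four pairs cat/persuade, dog/persuade, bark/persuade, and dog/bark.

```python
# Toy predications for "the cat persuaded the dog to bark".
# Variables starting with 'e' are eventualities, 'x' instances.
preds = [
    {"pred": "_cat_n",      "ARG0": "x1"},
    {"pred": "_persuade_v", "ARG0": "e1", "ARG1": "x1", "ARG2": "x2", "ARG3": "e2"},
    {"pred": "_dog_n",      "ARG0": "x2"},
    {"pred": "_bark_v",     "ARG0": "e2", "ARG1": "x2"},
]

def icons_pairs(predications):
    """One (individual, eventuality) pair for each argument of each
    eventuality-headed predication, per Dan's criterion."""
    eventualities = {p["ARG0"] for p in predications if p["ARG0"].startswith("e")}
    pairs = set()
    for p in predications:
        if p["ARG0"] not in eventualities:
            continue  # skip nominal predications like _cat_n
        for role, value in p.items():
            if role in ("pred", "ARG0"):
                continue
            pairs.add((value, p["ARG0"]))  # (argument, governing eventuality)
    return pairs

# Four pairs: cat/persuade, dog/persuade, bark/persuade, dog/bark.
print(sorted(icons_pairs(preds)))
```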
>>>
>>>
>>>         So if the necessary ICONS elements have to be introduced
>>>         overtly by the lexicon/grammar during composition, then I
>>>         would still like to explore a middle ground that does not
>>>         result in the full set of ICONS elements Song and Bender
>>>         propose for a sentence.  That is, I wondered whether we
>>>         could make do with adding to the ERG the necessary
>>>         introduction of just those ICONS elements that would enable
>>>         us to draw the distinctions between `unmarked', 'topic', and
>>>         'focus' that we were used to exploiting in the days of
>>>         messages.   But since pretty much any preposition's or
>>>         adjective's or verb's complement can be extracted, and any
>>>         verb's subject can be extracted, and most verbs' direct and
>>>         indirect objects can be passivized, I think we'll still end
>>>         up with an ICONS entry for each eventuality/argument pair
>>>         for every predication-introducing verb, adjective, and
>>>         preposition in a sentence, and maybe also for some nouns as
>>>         in "who is that picture of?".  This still lets us exclude
>>>         ICONS elements involving adverbs and maybe also the
>>>         arguments of conjunctions, subordinators, modals.  If we
>>>         went this route, I think it would be possible to make modest
>>>         additions to certain of the constructions, and not have to
>>>         meddle with lexical types, to get these ICONS elements into
>>>         the MRS during composition.
>>>
>>>
>>>         Such a partial approach does not have the purity of Song and
>>>         Bender's account, but might be more practical, at least as a
>>>         first step, for the ERG.  It would at least enable what I
>>>         think is a more consistent interpretation of the ICONS
>>>         elements for generation, and should give us the fine-grained
>>>         control I agree that we want.  Thus to get the generator to
>>>         produce all variants from an MRS produced by parsing a
>>>         simple declarative, one would have to remove the info-str
>>>         ICONS element whose presence excludes the specialization to
>>>         focus or topic because of our friend Skolem.
>>>
>>>
>>>         Counsel?
>>>
>>>
>>>          Dan
>>>
>>>
>>>         ------------------------------------------------------------------------
>>>         *From:* developers-bounces at emmtee.net
>>>         <mailto:developers-bounces at emmtee.net>
>>>         <developers-bounces at emmtee.net>
>>>         <mailto:developers-bounces at emmtee.net> on behalf of Ann
>>>         Copestake <aac10 at cam.ac.uk> <mailto:aac10 at cam.ac.uk>
>>>         *Sent:* Friday, February 5, 2016 1:43 PM
>>>         *To:* Emily M. Bender; Stephan Oepen
>>>         *Cc:* developers; Ann Copestake
>>>         *Subject:* Re: [developers] ICONS and generation
>>>         Thanks!
>>>
>>>         On 05/02/2016 21:30, Emily M. Bender wrote:
>>>>         Not sure if this answers the question, but a couple of
>>>>         comments:
>>>>
>>>>         (a) I do think that written English is largely
>>>>         underspecified for information structure.
>>>>         It's part of what makes good writing good that the
>>>>         information structure is made apparent
>>>>         somehow.
>>>>
>>>
>>>         OK.  Should I understand you as saying that composition (as
>>>         in, what we do in the grammars) leaves it mostly
>>>         underspecified, but that discourse level factors make it
>>>         apparent?  or that it really is underspecified?
>>>
>>>>         (b) I think the "I want only the unmarked form back" case
>>>>         might be handled by either
>>>>         a setting which says "no ICONS beyond what was in the input"
>>>>         (i.e. your ICONS { }) or
>>>>         a pre-processing/generation fix-up rule that takes ICONS {
>>>>         ... } and outputs something
>>>>         that would be incompatible with anything but the unmarked
>>>>         form.  Or maybe the
>>>>         subsumption check goes the wrong way for this one?
>>>>
>>>         Yes, I think the ICONS {} might be a possible way of
>>>         thinking about it.  I should make it clear - I don't think
>>>         there's a problem with constructing an implementation that
>>>         produces the `right' behaviour but I would much prefer that
>>>         the behaviour is specifiable cleanly in the formalism rather
>>>         than as another parameter to the generator or whatever.
>>>
>>>>         I hope Sanghoun has something to add here!
>>>>
>>>>         Emily
>>>>
>>>>         On Fri, Feb 5, 2016 at 1:01 PM, Stephan Oepen
>>>>         <oe at ifi.uio.no <mailto:oe at ifi.uio.no>> wrote:
>>>>
>>>>             colleagues,
>>>>
>>>>             my ideal would be a set-up where the provider of
>>>>             generator inputs has three options: (a) request
>>>>             topicalization (or similar), (b) disallow it, or (c)
>>>>             underspecify and get both variants.
>>>>
>>>>             we used to have that level of control (and flexibility)
>>>>             in the LOGON days where there were still messages: in
>>>>             the message EPs, there were two optional ‘pseudo’ roles
>>>>             (TPC and PSV) to control topicalization or
>>>>             passivization of a specific instance variable.
>>>>              effectively, when present, these established a binary
>>>>             relation between the clause and one of its
>>>>             nominal constituents.  if i recall correctly, blocking
>>>>             topicalization was accomplished by putting an otherwise
>>>>             unbound ‘anti’-variable into the TPC or PSV roles.
>>>>
>>>>             could one imagine something similar in the ICONS realm,
>>>>             and if so, which form would it have to take?
>>>>
>>>>             best wishes, oe
>>>>
>>>>
>>>>             On Friday, February 5, 2016, Woodley Packard
>>>>             <sweaglesw at sweaglesw.org
>>>>             <mailto:sweaglesw at sweaglesw.org>> wrote:
>>>>
>>>>                 I can confirm that under ACE, behavior is what you
>>>>                 indicate, i.e. generating from parsing the
>>>>                 topicalized feline-canine-playtime I get just the
>>>>                 topicalized variant out, but when generating from
>>>>                 parsing the ordinary word order I get all 5
>>>>                 variants out.
>>>>
>>>>                 I believe this was designed to imitate the
>>>>                 long-standing condition that the MRS of generation
>>>>                 results must be subsumed by the input MRS.  The
>>>>                 observed behavior seems to me to be the correct
>>>>                 interpretation of the subsumption relation with
>>>>                 ICONS involved. Note that an MRS with an extra
>>>>                 intersective modifier would also be subsumed, for
>>>>                 example, but such MRS are never actually generated
>>>>                 since those modifier lexical entries never make it
>>>>                 into the chart.
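The open-ended subsumption behaviour Woodley describes can be sketched with a tiny type hierarchy. This is an illustrative toy (the `info-str`/`topic`/`focus` lattice is simplified and the function names are mine, not ACE internals): a generation result passes if every input ICONS element is matched by some result element, while extra result elements are tolerated, which is exactly why an empty input yields all five variants.

```python
# Minimal info-str type lattice: focus and topic specialize info-str.
PARENTS = {"focus": "info-str", "topic": "info-str", "info-str": None}

def subsumes(general: str, specific: str) -> bool:
    """True if `general` equals `specific` or is an ancestor of it."""
    while specific is not None:
        if specific == general:
            return True
        specific = PARENTS[specific]
    return False

def icons_ok(input_icons, result_icons):
    """Open-ended check: every input (target, relation) element must
    subsume some result element; extra result elements are allowed."""
    return all(
        any(i_tgt == r_tgt and subsumes(i_rel, r_rel)
            for (r_tgt, r_rel) in result_icons)
        for (i_tgt, i_rel) in input_icons
    )

# Empty input ICONS: any result passes, so all variants come out.
assert icons_ok([], [("x2", "topic")])
# Topicalized input: only results that realize the topic element pass.
assert icons_ok([("x2", "topic")], [("x2", "topic")])
assert not icons_ok([("x2", "topic")], [])
```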
>>>>
>>>>                 It’s certainly reasonable to ask whether (this
>>>>                 notion of) subsumption is really the right test.
>>>>                 I’ve met lots of folks who prefer to turn that
>>>>                 subsumption test off entirely.  I guess it’s also
>>>>                 possible that the subsumption test is right for the
>>>>                 RELS portion of the MRS but not for the ICONS,
>>>>                 though that seems a bit odd to consider. However,
>>>>                 given that we don’t have many ideas about
>>>>                 truth-conditional implications of ICONS, maybe not
>>>>                 so odd.
>>>>
>>>>                 I don’t really have much to offer in terms of
>>>>                 opinions about what the right behavior should be. 
>>>>                 I (believe I) just implemented what others asked
>>>>                 for a couple years ago :-)
>>>>
>>>>                 -Woodley
>>>>
>>>>                 > On Feb 5, 2016, at 8:03 AM, Ann Copestake
>>>>                 <aac10 at cl.cam.ac.uk <mailto:aac10 at cl.cam.ac.uk>> wrote:
>>>>                 >
>>>>                 > I'm part way through getting ICONS support
>>>>                 working in Lisp, testing on the version of the ERG
>>>>                 available as trunk. I have a question about
>>>>                 generation. If I implemented the behaviour
>>>>                 described in http://moin.delph-in.net/IconsSpecs
>>>>                 there doesn't seem to be a way of specifying that I
>>>>                 want a `normal' ordering for English.
>>>>                 >
>>>>                 > e.g., if I take the MRS resulting from
>>>>                 >
>>>>                 > that dog, the cat chased.
>>>>                 >
>>>>                 > without ICONS check, there are 5 realizations,
>>>>                 including the `null ICONS' case `The cat chased
>>>>                 that dog.'  With an exact ICONS check, I can select
>>>>                 realizations with the same ICONS (modulo order of
>>>>                 ICONS elements, of course, in the case where
>>>>                 there's more than one element).  But with the
>>>>                 http://moin.delph-in.net/IconsSpecs
>>>>                 behaviour, there's no way of specifying I want a
>>>>                 `normal' order - if I don't give an ICONS, I will
>>>>                 always get the 5 realisations. In fact, as I
>>>>                 understand it, I can always end up with more ICONS
>>>>                 elements in the realisation than in the input, as
>>>>                 long as I can match the ones in the input.
>>>>                 >
>>>>                 > So:
>>>>                 > - is the IConsSpec behaviour what is desired for
>>>>                 the ERG (e.g., because one can rely on the
>>>>                 realisation ranking to prefer the most `normal' order)?
>>>>                 > - or does the ERG behave differently from Emily
>>>>                 and Sanghoun's grammars, such that different
>>>>                 generator behaviour is desirable? and if so, could
>>>>                 we change things so we don't need different behaviours?
>>>>                 >
>>>>                 > Ann
>>>>                 >
>>>>                 >
>>>>                 >
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>         -- 
>>>>         Emily M. Bender
>>>>         Professor, Department of Linguistics
>>>>         Check out CLMS on facebook!
>>>>         http://www.facebook.com/uwclma
>>>
>>
>>
>>
>>
>>     -- 
>>     =================================
>>     Sanghoun Song
>>     Assistant Professor
>>     Dept. of English Language and Literature
>>     Incheon National University
>>     http://corpus.mireene.com
>>     phone: +82-32-835-8129 (office)
>>     =================================
>
>
>
>
> -- 
> Emily M. Bender
> Professor, Department of Linguistics
> Check out CLMS on facebook! http://www.facebook.com/uwclma

