<html> <head> <meta content="text/html; charset=utf-8" http-equiv="Content-Type"> </head> <body bgcolor="#FFFFFF" text="#000000"> Thanks - would anyone else like to comment? I'm currently not sure whether simply enriching ICONS (and possibly HCONS) with a ... notation would be enough to give us what's needed in terms of behaviour but I suspect that needs to be part of the solution.<br> <br> Ann<br> <br> <div class="moz-cite-prefix">On 18/02/16 00:26, Emily M. Bender wrote:<br> </div> <blockquote cite="mid:CAMype6cKCD6B42nhaXxmauZGAK1mTg=zZgOA0yx5b1Eo3mvwhA@mail.gmail.com" type="cite"> <div dir="ltr">Just a quick and belated reply to say that from where I sit your analysis <div>of the situation makes a lot of sense.</div> <div><br> </div> <div>Emily</div> </div> <div class="gmail_extra"><br> <div class="gmail_quote">On Sun, Feb 7, 2016 at 2:12 AM, Ann Copestake <span dir="ltr"><<a moz-do-not-send="true" href="mailto:aac10@cam.ac.uk" target="_blank">aac10@cam.ac.uk</a>></span> wrote:<br> <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> <div bgcolor="#FFFFFF" text="#000000"> Thanks! and thanks all! I've come to a view on this which I think is consistent with what everyone has been saying.<br> <br> First of all, note that in the MRS syntax, we do not distinguish between terminated and non-terminated lists/bags. If we think about it from the perspective of typed feature structures, it is clear that there is a distinction - for instance a type `list' is the most general type of list, and the type `e-list' (empty list) is usually a maximally specific type. Coming back to the notation I used in an earlier message, there is a distinction between { ... } (analogous to list in a TFS) and {} (cf e-list). <br> <br> Now, there are two possible interpretations of ICONS as it arises from a DELPH-IN grammar (i.e., as it is output after parsing):<br> 1. information structure<br> 2. 
information structure as it arises from morphosyntax<br> <br> In the `normal' sentences of the `Kim chased the dog' type, no information structure elements arise from morphosyntax. We can, however, expect that various contexts (e.g., discourse) give rise to information structure in association with such a sentence. Hence, with respect to interpretation 1, ICONS is not strictly empty but underspecified (and similarly a one-element ICONS may be underspecified with respect to a two-element ICONS and so on). I think this is consistent with what Sanghoun and Emily are saying. Under this interpretation, an MRS with no specification of ICONS should indeed generate all the variant sentences we've been discussing. And so on.<br> <br> However, with respect to interpretation 2, the ICONS emerging from the parse of such a sentence is terminated. Once we've finished parsing, we're guaranteeing no more ICONS elements will arise from morphosyntax, whatever someone does in discourse. Under this interpretation, if I say I want to generate a sentence with an empty ICONS, I mean I want to generate a sentence with no ICONS contribution from morphosyntax. This is also a legitimate use of the realiser, considered as a stand-alone module.<br> <br> Since ICONS is something which I have always thought of as on the boundary of morphosyntax and discourse, I want to be able to enrich ICONS emerging from parsing with discourse processing, so interpretation 1 makes complete sense. However, I believe it is also perfectly legitimate to be able to divide the world into what the grammar can be expected to do and what it can't, and that is consistent with interpretation 2.<br> <br> As a hypothetical move, consider an additional classification of ICONS elements according to whether or not they arise from morphosyntax. Then we can see that a single ICONS value could encompass both interpretations. 
i.e., what would arise from a parse would be a terminated list of morphosyntactic-ICONS elements but the ICONS as a whole could be non-terminated.<br> <br> I think there may be reasons to be able to distinguish ICONS elements according to whether they are intended as grammar-derived or not, though I do see this might look messy. But anyway, I want to first check that everyone agrees with this analysis of the situation before trying to work out what we might do about it in terms of notation. <br> <br> Incidentally - re Dan's message - my overly brief comment about Sanghoun's use of DMRS earlier was intended to point out that if DMRS had the necessary links for demonstrating ICONS, then in principle this was something we know how to extract. But right now, I'm not clear whether or not we do need all the underspecified elements, and that's something I would like Dan to comment on before we go further.<br> <br> All best,<br> <br> Ann <div> <div class="h5"><br> <br> <br> <div>On 06/02/2016 17:59, Sanghoun Song wrote:<br> </div> <blockquote type="cite"> <div dir="ltr">My apologies for my really late reply!<br> <br> I am not sure whether I fully understand your discussion, but I would like to share several of my ideas on using ICONS for generation. <br> <br> First, in my analysis (final version), only expressions that contribute to information structure introduce an ICONS element into the list. For example, the following unmarked sentence (a below) has no ICONS element (i.e. empty ICONS). <br> <br> a. plain: Kim chases the dog.<br> b. passivization: The dog is chased by Kim.<br> c. fronting: The dog Kim chases.<br> d. clefting: It is the dog that Kim chases.<br> <br> Using the type hierarchy for information structure in my thesis, I can say the following:<br> <br> (i) The subject Kim and the object the dog in a plain active sentence (a) are in situ. 
They may or may not be focused depending on which constituent bears a specific accent, but in the sentence-based processing their information structure values had better remain underspecified for flexible representation.<br> <br> (ii) The promoted argument the dog in the passive sentence (b) is evaluated as conveying focus-or-topic, while the demoted argument Kim is associated with non-topic. <br> <br> (iii) In (c), the fronted object the dog is assumed to be assigned focus-or-topic in that the sentence conveys a meaning of either "As for the dog, Kim chases it" or (d), while the subject in situ is evaluated as containing neither topic nor focus (i.e. background). (Background may not be implemented in the ERG, I think.)<br> <br> (iv) The focused NP in (d) carries focus, and the subject in the cleft clause Kim is also associated with bg. <br> <br> Thus, we can create a focus specification hierarchy amongst (a-d) as [clefting > fronting > passivization > plain].<br> <br> What I want to say is that a set of sentences which share some properties may have subtle shades of meaning depending on how focus is assigned to the sentences. Paraphrasing is made only in the direction from the right to the left of [clefting > fronting > passivization > plain], because paraphrasing in the opposite direction necessarily causes loss of information. For example, a plain sentence such as (a) can be paraphrased into a cleft construction such as (d), but not vice versa.<br> <br> In a nutshell, a more specific sentence had better not be paraphrased into a less specific sentence in terms of information structure. <br> <br> Second, I provided many dependency graphs in my thesis. The main reason was that nobody outside of DELPH-IN can fully understand the complex co-indexation in ICONS/MRS. At that time, I didn't work on DMRS with respect to ICONS. If there is a way to represent ICONS in DMRS (direct from TFS or via MRS), I am interested in the formalism. 
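<br> <br> The paraphrase direction in the focus specification hierarchy above can be sketched in a few lines of Python. This is only an illustration, not part of any DELPH-IN tool: the names SPECIFICITY and can_paraphrase are hypothetical, the strict linear ordering is simply the hierarchy as stated above, and allowing a construction to be paraphrased into itself is an assumption.<br> <br>
<pre>
```python
# Hypothetical sketch: names and data layout are illustrative assumptions.
# Constructions ordered from least to most specific in information structure,
# i.e. the stated hierarchy read from right to left.
SPECIFICITY = ["plain", "passivization", "fronting", "clefting"]


def can_paraphrase(source, target):
    """Return True if paraphrasing source as target keeps all focus information.

    A paraphrase is allowed only toward an equally specific or more specific
    construction (assumption: a construction may paraphrase into itself).
    """
    start = SPECIFICITY.index(source)
    return target in SPECIFICITY[start:]


# A plain sentence may become a cleft, but not the other way round.
print(can_paraphrase("plain", "clefting"))
print(can_paraphrase("clefting", "plain"))
```
</pre>
Under these assumptions a plain sentence can be mapped to any of the four constructions, while a cleft can only be mapped to another cleft.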
<br> <br> <br> Sanghoun<br> <br> </div> <div class="gmail_extra"><br> <div class="gmail_quote">On Sat, Feb 6, 2016 at 1:26 AM, Ann Copestake <span dir="ltr"><<a moz-do-not-send="true" href="mailto:aac10@cam.ac.uk" target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:aac10@cam.ac.uk">aac10@cam.ac.uk</a></a>></span> wrote:<br> <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> <div bgcolor="#FFFFFF" text="#000000"> Briefly (more this evening maybe) - I don't see a particular problem with filling in the ICONS since what you describe are relationships that are overt in the *MRS anyway, aren't they? I thought, in fact, that these are pretty clear from the DMRS graph - which is why Sanghoun uses it to describe what's going on. <br> <br> I believe we can build the DMRS graph direct from the TFS, incidentally - don't need to go via MRS ...<br> <br> Cheers,<br> <br> Ann <div> <div><br> <br> <div>On 05/02/2016 23:40, Dan Flickinger wrote:<br> </div> <blockquote type="cite"> <div style="font-size:12pt;color:#000000;background-color:#ffffff;font-family:Calibri,Arial,Helvetica,sans-serif"> <p>As I understand the Song and Bender account, an MRS for a sentence should include in the ICONS list at least one element for each individual (eventuality or instance) that is introduced. 
In the ERG this would mean that the value of each ARG0 should appear in at least one ICONS entry, where most of these would be of the maximally underspecified type `info-str', but possibly specialized because of syntactic structure or stress/accent or maybe even discourse structure.</p> <p><br> </p> <p>I see the virtue of having these overt ICONS elements even when of type `info-str', to enable the fine-grained control that Stephan notes that we want for generation, and also to minimize the differences between the ERG and grammars being built from the Matrix which embody Sanghoun's careful work.</p> <p><br> </p> <p>If the grammarian is to get away with not explicitly introducing each of these ICONS elements in the lexical entries, as Sanghoun does in the Matrix, then it would have to be possible to predict and perhaps mechanically add the missing ones after composition was completed. I used to hope that this would be possible, but now I'm doubtful, leading me to think that there is no good alternative to the complication (maybe I should more kindly use the term `enrichment') of the grammar with the overt introduction of these guys everywhere. Here's my reasoning:</p> <p><br> </p> <p>I assume that what we'll want in an MRS for an ordinary sentence is an ICONS list that has exactly one entry for each pair of an individual `i' and the eventuality which is the ARG0 of each predication in which `i' appears as an argument. Thus for `the cat persuaded the dog to bark' the ICONS list should have four elements: one for cat/persuade, one for dog/persuade, one for bark/persuade, and one for dog/bark. 
Now if I wanted to have the grammar continue to only insert ICONS elements during composition for the non-vanilla info-str phenomena, and fill in the rest afterward, I would have to know not only the arity of each eventuality-predication, but which of its arguments was realized in the sentence, and even worse, which of the realized syntactic arguments corresponded to semantic arguments (so for example not the direct object of `believe'). Maybe I give up too soon here, but this does not seem doable just operating on the MRS resulting from composition, even with access to the SEM-I.</p> <p><br> </p> <p>So if the necessary ICONS elements have to be introduced overtly by the lexicon/grammar during composition, then I would still like to explore a middle ground that does not result in the full set of ICONS elements Song and Bender propose for a sentence. That is, I wondered whether we could make do with adding to the ERG the necessary introduction of just those ICONS elements that would enable us to draw the distinctions between `unmarked', `topic', and `focus' that we were used to exploiting in the days of messages. But since pretty much any preposition's or adjective's or verb's complement can be extracted, and any verb's subject can be extracted, and most verbs' direct and indirect objects can be passivized, I think we'll still end up with an ICONS entry for each eventuality/argument pair for every predication-introducing verb, adjective, and preposition in a sentence, and maybe also for some nouns as in "who is that picture of?". This still lets us exclude ICONS elements involving adverbs and maybe also the arguments of conjunctions, subordinators, modals. 
If we went this route, I think it would be possible to make modest additions to certain of the constructions, and not have to meddle with lexical types, to get these ICONS elements into the MRS during composition.</p> <p><br> </p> <p>Such a partial approach does not have the purity of Song and Bender's account, but might be more practical, at least as a first step, for the ERG. It would at least enable what I think is a more consistent interpretation of the ICONS elements for generation, and should give us the fine-grained control I agree that we want. Thus to get the generator to produce all variants from an MRS produced by parsing a simple declarative, one would have to remove the info-str ICONS element whose presence excludes the specialization to focus or topic because of our friend Skolem.</p> <p><br> </p> <p>Counsel?</p> <p><br> </p> <p> Dan<br> </p> <br> <div style="color:rgb(0,0,0)"> <hr style="display:inline-block;width:98%"> <div dir="ltr"><font style="font-size:11pt" face="Calibri, sans-serif" color="#000000"><b>From:</b> <a moz-do-not-send="true" href="mailto:developers-bounces@emmtee.net" target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:developers-bounces@emmtee.net">developers-bounces@emmtee.net</a></a> <a moz-do-not-send="true" href="mailto:developers-bounces@emmtee.net" target="_blank"><developers-bounces@emmtee.net></a> on behalf of Ann Copestake <a moz-do-not-send="true" href="mailto:aac10@cam.ac.uk" target="_blank"><a class="moz-txt-link-rfc2396E" href="mailto:aac10@cam.ac.uk"><aac10@cam.ac.uk></a></a><br> <b>Sent:</b> Friday, February 5, 2016 1:43 PM<br> <b>To:</b> Emily M. Bender; Stephan Oepen<br> <b>Cc:</b> developers; Ann Copestake<br> <b>Subject:</b> Re: [developers] ICONS and generation</font> <div> </div> </div> <div>Thanks!<br> <br> <div>On 05/02/2016 21:30, Emily M. 
Bender wrote:<br> </div> <blockquote type="cite"> <div dir="ltr">Not sure if this answers the question, but a couple of comments: <div><br> </div> <div>(a) I do think that written English is largely underspecified for information structure.</div> <div>It's part of what makes good writing good that the information structure is made apparent</div> <div>somehow.</div> <div><br> </div> </div> </blockquote> <br> OK. should I understand you as saying that composition (as in, what we do in the grammars) leaves it mostly underspecified, but that discourse level factors make it apparent? or that it really is underspecified?<br> <br> <blockquote type="cite"> <div dir="ltr"> <div>(b) I think the "I want only the unmarked form back" case might be handled by either</div> <div>a setting which says "no ICONS beyond what is in the input" (i.e. your ICONS { }) or</div> <div>a pre-processing/generation fix-up rule that takes ICONS { ... } and outputs something</div> <div>that would be incompatible with anything but the unmarked form. Or maybe the</div> <div>subsumption check goes the wrong way for this one?</div> <div><br> </div> </div> </blockquote> Yes, I think the ICONS {} might be a possible way of thinking about it. 
I should make it clear - I don't think there's a problem with constructing an implementation that produces the `right' behaviour but I would much prefer that the behaviour is specifiable cleanly in the formalism rather than as another parameter to the generator or whatever.<br> <br> <blockquote type="cite"> <div dir="ltr"> <div>I hope Sanghoun has something to add here!</div> <div><br> </div> <div>Emily</div> </div> <div class="gmail_extra"><br> <div class="gmail_quote">On Fri, Feb 5, 2016 at 1:01 PM, Stephan Oepen <span dir="ltr"> <<a moz-do-not-send="true" href="mailto:oe@ifi.uio.no" target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:oe@ifi.uio.no">oe@ifi.uio.no</a></a>></span> wrote:<br> <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> colleagues, <div><br> </div> <div>my ideal would be a set-up where the provider of generator inputs has three options: (a) request topicalization (or similar), (b) disallow it, or (c) underspecify and get both variants. <div><br> </div> <div>we used to have that level of control (and flexibility) in the LOGON days where there were still messages: in the message EPs, there were two optional ‘pseudo’ roles (TPC and PSV) <span></span>to control topicalization or passivization of a specific instance variable. effectively, when present, these established a binary relation between the clause and one of its nominal constituents. 
if i recall correctly, blocking topicalization was accomplished by putting an otherwise unbound ‘anti’-variable into the TPC or PSV roles.</div> <div><br> </div> <div>could one imagine something similar in the ICONS realm, and if so, which form would it have to take?</div> <div><br> </div> <div>best wishes, oe</div> <div> <div> <div><br> <br> On Friday, February 5, 2016, Woodley Packard <<a moz-do-not-send="true" href="mailto:sweaglesw@sweaglesw.org" target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:sweaglesw@sweaglesw.org">sweaglesw@sweaglesw.org</a></a>> wrote:<br> <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> I can confirm that under ACE, behavior is what you indicate, i.e. generating from parsing the topicalized feline-canine-playtime I get just the topicalized variant out, but when generating from parsing the ordinary word order I get all 5 variants out.<br> <br> I believe this was designed to imitate the long-standing condition that the MRS of generation results must be subsumed by the input MRS. The observed behavior seems to me to be the correct interpretation of the subsumption relation with ICONS involved. Note that an MRS with an extra intersective modifier would also be subsumed, for example, but such MRS are never actually generated since those modifier lexical entries never make it into the chart.<br> <br> It’s certainly reasonable to ask whether (this notion of) subsumption is really the right test. I’ve met lots of folks who prefer to turn that subsumption test off entirely. I guess it’s also possible that the subsumption test is right for the RELS portion of the MRS but not for the ICONS, though that seems a bit odd to consider. However, given that we don’t have many ideas about truth-conditional implications of ICONS, maybe not so odd.<br> <br> I don’t really have much to offer in terms of opinions about what the right behavior should be. 
I (believe I) just implemented what others asked for a couple years ago :-)<br> <br> -Woodley<br> <br> > On Feb 5, 2016, at 8:03 AM, Ann Copestake <<a moz-do-not-send="true" href="mailto:aac10@cl.cam.ac.uk" target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:aac10@cl.cam.ac.uk">aac10@cl.cam.ac.uk</a></a>> wrote:<br> ><br> > I'm part way through getting ICONS support working in Lisp, testing on the version of the ERG available as trunk. I have a question about generation. If I implemented the behaviour described in <a moz-do-not-send="true" href="http://moin.delph-in.net/IconsSpecs" target="_blank"><a class="moz-txt-link-freetext" href="http://moin.delph-in.net/IconsSpecs">http://moin.delph-in.net/IconsSpecs</a></a> there doesn't seem to be a way of specifying that I want a `normal' ordering for English.<br> ><br> > e.g., if I take the MRS resulting from<br> ><br> > that dog, the cat chased.<br> ><br> > without ICONS check, there are 5 realizations, including the `null ICONS' case `The cat chased that dog.' With an exact ICONS check, I can select realizations with the same ICONS (modulo order of ICONS elements, of course, in the case where there's more than one element). But with the <a moz-do-not-send="true" href="http://moin.delph-in.net/IconsSpecs" target="_blank"> </a><a moz-do-not-send="true" href="http://moin.delph-in.net/IconsSpecs" target="_blank"><a class="moz-txt-link-freetext" href="http://moin.delph-in.net/IconsSpecs">http://moin.delph-in.net/IconsSpecs</a></a> behaviour, there's no way of specifying I want a `normal' order - if I don't give an ICONS, I will always get the 5 realisations. 
In fact, as I understand it, I can always end up with more icons in the realisation than in the input, as long as I can match the ones in the input.<br> ><br> > So:<br> > - is the IConsSpec behaviour what is desired for the ERG (e.g., because one can rely on the realisation ranking to prefer the most `normal' order)?<br> > - or does the ERG behave differently from Emily and Sanghoun's grammars, such that different generator behaviour is desirable? and if so, could we change things so we don't need different behaviours<br> ><br> > Ann<br> ><br> ><br> ><br> <br> <br> </blockquote> </div> </div> </div> </div> </blockquote> </div> <br> <br clear="all"> <div><br> </div> -- <br> <div> <div dir="ltr">Emily M. Bender<br> Professor, Department of Linguistics<br> Check out CLMS on facebook! <a moz-do-not-send="true" href="http://www.facebook.com/uwclma" target="_blank"> </a><a moz-do-not-send="true" href="http://www.facebook.com/uwclma" target="_blank"><a class="moz-txt-link-freetext" href="http://www.facebook.com/uwclma">http://www.facebook.com/uwclma</a></a><br> </div> </div> </div> </blockquote> <br> </div> </div> </div> </blockquote> <br> </div> </div> </div> </blockquote> </div> <br> <br clear="all"> <br> -- <br> <div> <div>=================================</div> <div>Sanghoun Song</div> <div>Assistant Professor</div> <div>Dept. of English Language and Literature</div> <div>Incheon National University</div> <div><a moz-do-not-send="true" href="http://corpus.mireene.com" target="_blank">http://corpus.mireene.com</a></div> <div>phone: +82-32-835-8129 (office)</div> <div>=================================</div> </div> </div> </blockquote> <br> </div> </div> </div> </blockquote> </div> <br> <br clear="all"> <div><br> </div> -- <br> <div class="gmail_signature"> <div dir="ltr">Emily M. Bender<br> Professor, Department of Linguistics<br> Check out CLMS on facebook! 
<a moz-do-not-send="true" href="http://www.facebook.com/uwclma" target="_blank">http://www.facebook.com/uwclma</a><br> </div> </div> </div> </blockquote> <br> </body> </html>