[developers] Dropped arguments in DMRS
Ann Copestake
aac10 at cl.cam.ac.uk
Tue Jan 12 20:38:44 CET 2016
Thanks!
I assume that the unexpressed argument cases like English `eat' are
different from dropped arguments in Spanish (and I guess Japanese)
because these apply across the board rather than being lexically
specific. I also assume it's clearer to a naive speaker that something
is `missing'. But I assume that putting back zero-pronouns just for
the case where there's a coindexation is not something that one would
want to try and do in a grammar - it doesn't seem likely it could be
done compositionally and monotonically, if it could be done at all.
Anyway, I will leave it up to others to decide what they want to do, but
let's leave it that I am willing to try and add support directly in DMRS
if that's thought to be helpful.
All best,
Ann
On 12/01/2016 01:43, Michael Wayne Goodman wrote:
> I think we're all in agreement that the reduced clutter of DMRS and
> EDS is a good thing. I brought up the ICONS and
> coindexed-dropped-arguments issues as examples where the dependency
> representations are currently unable to be equivalent with existing
> MRS representations, but I think we'd all agree that we shouldn't just
> do whatever it takes to make them equivalent. More below...
>
> On Mon, Jan 11, 2016 at 1:22 PM Ann Copestake <aac10 at cl.cam.ac.uk> wrote:
>
> Hi all,
>
> On 11/01/2016 16:41, Stephan Oepen wrote:
> > if we were to add code to synthesize nodes for variables that are not
> > introduced as the distinguished (or characteristic) variable of any EP
> > but occur at least twice in an MRS, it would seem natural to me to
> > leave these nodes unlabeled (they will not have characterization or
> > other surface links either). this would also indicate that they have a
> > somewhat different status formally (at least in terms of
> > correspondences to a full MRS).
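
As a rough illustration of the check described above (variables that are not the distinguished variable of any EP but occur at least twice), here is a minimal Python sketch. The `EP` tuple, the function name, and the toy predicates are assumptions made for this example, not part of any DELPH-IN tool, and the toy MRS is only loosely modelled on the tabe-sugiru case rather than on actual Jacy output. Handles are skipped on the assumption that only non-scopal arguments are of interest here.

```python
from collections import Counter
from typing import Dict, List, NamedTuple


class EP(NamedTuple):
    """Toy elementary predication (hypothetical structure for this sketch)."""
    pred: str
    arg0: str              # distinguished (characteristic) variable
    args: Dict[str, str]   # role name -> variable, excluding ARG0


def shared_non_distinguished(eps: List[EP]) -> List[str]:
    """Return variables used as an argument at least twice that no EP introduces."""
    distinguished = {ep.arg0 for ep in eps}
    counts = Counter(
        var
        for ep in eps
        for var in ep.args.values()
        if not var.startswith("h")   # assumption: ignore handles
    )
    return sorted(v for v, n in counts.items() if n >= 2 and v not in distinguished)


# Invented toy MRS: both predicates share the dropped argument i3.
toy_mrs = [
    EP("_taberu_v_toy", "e2", {"ARG1": "i3", "ARG2": "i4"}),
    EP("_sugiru_v_toy", "e1", {"ARG1": "i3", "ARG2": "e2"}),
]

print(shared_non_distinguished(toy_mrs))
```

Only `i3` qualifies in this toy input: `i4` occurs once, and `e2` is the distinguished variable of an EP.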
>
> we can't really leave them unlabelled in DMRS because they wouldn't show
> up very well ... From my perspective, the alternatives to changing the
> DMRS code to allow them are
>   1. put zero pronouns back into JACY (without quantifiers)
>   2. add zero pronouns in some sort of post-processing step
>   3. add them to DMRS.
> Mike mentioned ICONS - but it doesn't help here because the nodes that
> would have to be equated don't exist. Or, at least, since ICONS can do
> absolutely anything, one could define a variant of ICONS which did
> encode it properly, but it's really too much of a stretch.
>
>
> In MRS, the ICONS solution might look like:
>
> RELS: < ... [ ... ARG1 i5 ...] [ ... ARG2 i6 ... ] ... >
> ICONS: < ... i5 equal i6 ... >
>
> where i5 and i6 are the dropped arguments that refer to the same thing
> (i.e., instead of coindexing the variable on both arguments). However,
> you're right that this would not naturally extend to DMRS, because it
> would require the relation to encode the rargname of both the source
> and target nodes. E.g., if we consider an extension to DMRS with
> <icons> as a sibling element to <node> and <link> elements:
>
> ...
> <icons>
>   <iarg1 nodeid="10001"><rargname>ARG1</rargname></iarg1>
>   <iarg2 nodeid="10002"><rargname>ARG2</rargname></iarg2>
>   <ireln>equal</ireln>
> </icons>
> ...
>
> I know. Yuck. Also, such a graph would no longer be simple nodes and
> links, but instead nodes and links and icon-edges, which, at best, is
> harder to explain to DELPH-IN muggles. It also feels like we're just
> flailing about pretending to know nothing about the things we are
> equating.
>
> So I now agree that the simplest and most enticing solution is some
> kind of zero-pronoun, whether it's defined by the grammar (solutions
> (1) and (2)) or by the DMRS formalism itself (solution (3)). In
> addition, as you've mentioned, a zero-pronoun would helpfully
> contribute a nodeid that could be used for anaphoric pronouns in
> discourse, if that is something we'll need later.
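
To make the zero-pronoun idea concrete, the following is a minimal Python sketch of a post-processing step in the spirit of solutions (2) and (3): each variable shared by two or more otherwise dangling arguments gets one new node, and each argument becomes an ordinary link to it. Everything here is an assumption for illustration: the dict/tuple encoding of nodes and links, the `zero_pron` predicate name, the `NEQ` link label, and the toy predicates are invented and do not reflect the actual DMRS code or Jacy output.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Link = Tuple[int, int, str, str]       # (source nodeid, target nodeid, rargname, post)
Dangling = Tuple[int, str, str]        # (source nodeid, rargname, variable)


def add_zero_pronouns(
    nodes: Dict[int, str], links: List[Link], dangling: List[Dangling]
) -> Tuple[Dict[int, str], List[Link]]:
    """Return copies of nodes/links with one zero-pronoun node per shared variable."""
    uses_by_var = defaultdict(list)
    for src, role, var in dangling:
        uses_by_var[var].append((src, role))

    new_nodes = dict(nodes)
    new_links = list(links)
    next_id = max(new_nodes, default=10000) + 1
    for var in sorted(uses_by_var):
        uses = uses_by_var[var]
        if len(uses) < 2:
            continue                        # not coindexed: stay dropped
        new_nodes[next_id] = "zero_pron"    # hypothetical predicate name
        new_links.extend((src, next_id, role, "NEQ") for src, role in uses)
        next_id += 1
    return new_nodes, new_links


# Invented toy graph: two predicates whose ARG1s share the variable i3.
toy_nodes = {10001: "_taberu_v_toy", 10002: "_sugiru_v_toy"}
toy_links = [(10002, 10001, "ARG2", "NEQ")]
toy_dangling = [(10001, "ARG1", "i3"), (10001, "ARG2", "i4"), (10002, "ARG1", "i3")]

out_nodes, out_links = add_zero_pronouns(toy_nodes, toy_links, toy_dangling)
print(out_nodes)
print(out_links)
```

The result stays a plain graph of nodes and links, and the new nodeid is available as a target for later anaphora or information-structure relations.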
>
> I believe the zero pronouns were dropped from Jacy precisely because
> of the clutter. Japanese drops a lot of arguments, and each one could
> result in two more uninteresting EPs (the zero pronoun and its
> quantifier). On a related note, my work on computing MRS isomorphism
> found that a major factor that caused inefficient computation was
> having many EPs with the same predicates (e.g., lots of udef_q_rels).
> Therefore, I don't think it's a great idea to reintroduce
> zero-pronouns in Jacy, unless we could constrain it somehow (e.g.,
> only for coindexation, without quantifiers, etc.). I don't have strong
> arguments for choosing (2) or (3). With (2), it might be easier to
> insert variable properties on the zero-pronoun (e.g. evidenced from
> verb agreement, like the dropped subjects in Spanish or something),
> while (3) can help keep the representation consistent across grammars
> (since it's defined by the formalism and not the grammar). (Aside:
> Japanese doesn't have verbal agreement in terms of PNG, but it does
> encode honorifics, even for dropped arguments. It would be nice to keep
> this information for the zero pronouns somewhere.)
>
> Lastly, granting a nodeid for unexpressed arguments can help with
> other ICONS relations (i.e., of the information structure kind, not
> the variable-equating kind we were talking about before) where one of
> the participants is unexpressed.
>
> > i share your sentiment that the disappearing of unexpressed arguments
> > in our dependency graphs in general reduces clutter and is desirable.
> > co-indexation of such (unexpressed) variables admittedly challenges
> > that position. if we end up special-coding for these cases, it would
> > be good to have the motivating examples and analyses readily available
> > (and publicly vetted). i believe emily may have been the first to
> > argue for such co-indexation, probably from her work in the
> > grammarium?
>
>
> http://moin.delph-in.net/SingaporeMrsWellformedness
>
> the example Mike gave was tabe-sugiru = eat-exceed = overeat
>
> while it would be very good if someone could write this up or point us
> to a proper write up, I don't think there's much room for argument about
> needing it for Japanese, unless one uses zero pronouns
>
>
> >
> > i recall we have talked once or twice in the past about adding an
> > explicit distinction of unexpressed variables. for the ERG at least,
> > i believe dan (and others) often look at ‘u’ and ‘i’ (and maybe ‘p’)
> > as variable sorts that indicate unexpressed arguments. but that is at
> > best a convention and prevents stronger typing of argument slots as
> > would be desirable. for example, the ARG2 of _eat_v_1 presumably must
> > always be an ‘x’ when expressed, but dan abstains from putting that
> > type into the lexical entry because the scoping machinery would
> > complain at an ‘x’-typed variable without a quantifier.
> >
> > would it work (and be desirable) to introduce a variable property, say
> > [ XP bool ], to distinguish expressed from unexpressed roles? i
> > imagine it would not be hard to make all constructions that bind roles
> > specialize XP to true; one could then use the VPM machinery upon MRS
> > read-out to default remaining (unspecific) XP values to false.
> > alternatively, i imagine one could obtain the same effect by making
> > the hierarchy of variable types a little richer, i.e. put something
> > above at least ‘x’ and ‘e’ to indicate unexpressed variants, say ‘w’
> > and ‘d’ (the immediately preceding letters :-).
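
The defaulting step of the [ XP bool ] proposal can be sketched in a few lines of Python. This is only a plain-dictionary stand-in for what the VPM machinery would do at MRS read-out, under the assumption that constructions binding a role have already set XP to true; the function name and the toy variables are hypothetical.

```python
from typing import Dict, Optional

Props = Dict[str, Optional[object]]


def default_xp(variables: Dict[str, Props]) -> Dict[str, Props]:
    """Return a copy of the variable properties with unspecified XP set to False."""
    result = {}
    for var, props in variables.items():
        new_props = dict(props)
        if new_props.get("XP") is None:   # missing or left underspecified
            new_props["XP"] = False
        result[var] = new_props
    return result


# Toy read-out: x4 was bound by some construction, i6 was never expressed.
toy_vars = {
    "x4": {"XP": True, "NUM": "sg"},
    "i6": {},
}

print(default_xp(toy_vars))
```

After defaulting, every variable carries an explicit XP value, so a consumer can tell expressed from unexpressed roles without relying on the ‘u’/‘i’/‘p’ sort convention.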
> >
> > any thoughts on actually introducing such an explicit marking of
> > unexpressed arguments?
> >
> > all best, oe
>
> The problem with handling unexpressed arguments `properly' is that there
> are multiple different classes of unexpressed arguments, as I outlined.
> In some cases in the ERG, verbs with optional arguments have unexpressed
> arguments in the semantics, while other cases don't. This also
> interacts with the desire to save on predicate names that has caused
> many predicates to appear with different arities (which, of course,
> isn't OK if one translates directly to a conventional logical
> representation and has to be interpreted as some sort of notational
> convenience).
>
> e.g., the ERG demo gives:
>
> Kim understood                        understand_v_by (e, x, p)
> Kim understood the sentence / Sandy   understand_v_by (e, x, x')
> Kim understood that Sandy was scared  understand_v_by (e, x, h, i)
> Kim ran                               run_v_1 (e, x)
> Kim ran the race / the store          run_v_1 (e, x, x')
> Kim hoped                             hope_v_1 (e, x)
> Kim dreamed                           dream_v_1 (e, x, p)
>
> I don't find this intuitive but we don't have a test set or criteria for
> *MRS which would make it clear why one representation is to be preferred
> over another, and I find it hard to imagine what such criteria could
> be. That's why I talked about anaphora in the previous message, since
> that could have been an example of a clear cut difference, though it
> seems (to me) it probably isn't. Failing such criteria, I don't want to
> argue that there's a problem with the ERG representations but it also
> means that dropping them gives one less thing to worry about when we're
> actually using the output.
>
> So - I don't think it'll be a big hassle to add them to the DMRS code,
> but I don't propose to add them to the DMRS formal description and I
> don't think it's worthwhile expending energy on trying to clean this
> up. There are ways to allow argument slot typing without messing up the
> scope machinery, if that's something that needs to be fixed.
>
> All best,
>
> Ann
>
>
>
>