[developers] Dropped arguments in DMRS

Mon Jan 11 22:21:49 CET 2016

Hi all,

On 11/01/2016 16:41, Stephan Oepen wrote:
> if we were to add code to synthesize nodes for variables that are not 
> introduced as the distinguished (or characteristic) variable of any EP 
> but occur at least twice in an MRS, it would seem natural to me to 
> leave these nodes unlabeled (they will not have characterization or 
> other surface links either). this would also indicate that they have a 
> somewhat different status formally (at least in terms of 
> correspondences to a full MRS).

We can't really leave them unlabelled in DMRS because they wouldn't show 
up very well.  From my perspective, the alternatives to changing the 
DMRS code to allow them are: (1) put zero pronouns back into JACY 
(without quantifiers); (2) add zero pronouns in some sort of 
post-processing step; (3) add them to DMRS.  Mike mentioned ICONS, but 
it doesn't help here because the nodes that would have to be equated 
don't exist.  Or, at least, since ICONS can do absolutely anything, one 
could define a variant of ICONS which did encode it properly, but it's 
really too much of a stretch.
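
To make the proposal concrete, here is a minimal sketch of how one might 
find the variables in question: those that are not the characteristic 
variable (ARG0) of any EP but occur at least twice as arguments.  This is 
not the DMRS conversion code; the dict-based EP layout, the function 
names, the placeholder label and the toy "eat"/"exceed" EPs are all 
assumptions made for illustration, and the toy EPs are not actual JACY 
output.

```python
from collections import Counter


def dropped_shared_variables(eps):
    """Return variables that no EP introduces as ARG0 but that fill
    at least two argument slots.  Handles ('h...') are ignored."""
    introduced = {ep["arg0"] for ep in eps}
    counts = Counter(
        var
        for ep in eps
        for var in ep["args"].values()
        if not var.startswith("h")
    )
    return sorted(v for v, n in counts.items() if n >= 2 and v not in introduced)


def synthesize_nodes(eps, label=None):
    """Build one placeholder node per dropped shared variable.
    label=None corresponds to leaving the node unlabelled; pass a
    string (hypothetical, e.g. 'dropped_arg') to label it instead."""
    return [
        {"variable": v, "label": label, "synthesized": True}
        for v in dropped_shared_variables(eps)
    ]


# Toy illustration loosely modelled on tabe-sugiru (eat-exceed); hypothetical.
eps = [
    {"pred": "eat", "arg0": "e2", "args": {"ARG1": "i3", "ARG2": "i4"}},
    {"pred": "exceed", "arg0": "e1", "args": {"ARG1": "i3", "ARG2": "e2"}},
]

nodes = synthesize_nodes(eps, label="dropped_arg")
```

In this toy input only the shared, unexpressed eater (i3) gets a node; 
i4 occurs once and is still dropped, and e2 already has its own EP.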

> i share your sentiment that the disappearing of unexpressed arguments
> in our dependency graphs in general reduces clutter and is desirable.
> co-indexation of such (unexpressed) variables admittedly challenges
> that position.  if we end up special-coding for these cases, it would
> be good to have the motivating examples and analyses readily available
> (and publicly vetted).  i believe emily may have been the first to
> argue for such co-indexation, probably from her work in the
> grammarium?

http://moin.delph-in.net/SingaporeMrsWellformedness

the example Mike gave was tabe-sugiru = eat-exceed = overeat

While it would be very good if someone could write this up or point us 
to a proper write-up, I don't think there's much room for argument about 
needing it for Japanese, unless one uses zero pronouns.

>
> i recall we have talked once or twice in the past about adding an
> explicit distinction of unexpressed variables.  for the ERG at least,
> i believe dan (and others) often look at ‘u’ and ‘i’ (and maybe ‘p’)
> as variable sorts that indicate unexpressed arguments.  but that is at
> best a convention and prevents stronger typing of argument slots as
> would be desirable.  for example, the ARG2 of _eat_v_1 presumably must
> always be an ‘x’ when expressed, but dan abstains from putting that
> type into the lexical entry because the scoping machinery would
> complain at ‘x’-typed variable without a quantifier.
>
> would it work (and be desirable) to introduce a variable property, say
> [ XP bool ], to distinguish expressed from unexpressed roles?  i
> imagine it would not be hard to make all constructions that bind roles
> specialize XP to true; one could then use the VPM machinery upon MRS
> read-out to default remaining (unspecific) XP values to false.
> alternatively, i imagine one could obtain the same effect by making
> the hierarchy of variable types a little richer, i.e. put something
> above at least ‘x’ and ‘e’ to indicate unexpressed variants, say ‘w’
> and ‘d’ (the immediately preceding letters :-).
>
> any thoughts on actually introducing such an explicit marking of
> unexpressed arguments?
>
> all best, oe

The problem with handling unexpressed arguments `properly' is that there 
are multiple different classes of unexpressed arguments, as I outlined. 
In some cases in the ERG, verbs with optional arguments have unexpressed 
arguments in the semantics, while other cases don't.  This also 
interacts with the desire to save on predicate names that has caused 
many predicates to appear with different arities (which, of course, 
isn't OK if one translates directly to a conventional logical 
representation and has to be interpreted as some sort of notational 
convenience).

e.g., the ERG demo gives:

Kim understood                         understand_v_by (e, x, p)
Kim understood the sentence / Sandy    understand_v_by (e, x, x')
Kim understood that Sandy was scared   understand_v_by (e, x, h, i)
Kim ran                                run_v_1 (e, x)
Kim ran the race / the store           run_v_1 (e, x, x')
Kim hoped                              hope_v_1 (e, x)
Kim dreamed                            dream_v_1 (e, x, p)

I don't find this intuitive but we don't have a test set or criteria for 
*MRS which would make it clear why one representation is to be preferred 
over another, and I find it hard to imagine what such criteria could 
be.  That's why I talked about anaphora in the previous message, since 
that could have been an example of a clear-cut difference, though it 
seems (to me) it probably isn't.  Failing such criteria, I don't want to 
argue that there's a problem with the ERG representations but it also 
means that dropping them gives one less thing to worry about when we're 
actually using the output.
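
For anyone who wants to check this sort of arity variation over a larger 
set of parses, a rough sketch follows.  It simply tabulates the argument 
counts from the examples above; the tuple layout and function names are 
my own assumptions, and the data is copied from the list rather than 
read from any parser output.

```python
from collections import defaultdict

# (sentence, predicate, argument variables), as listed in the examples above.
EXAMPLES = [
    ("Kim understood", "understand_v_by", ("e", "x", "p")),
    ("Kim understood the sentence / Sandy", "understand_v_by", ("e", "x", "x'")),
    ("Kim understood that Sandy was scared", "understand_v_by", ("e", "x", "h", "i")),
    ("Kim ran", "run_v_1", ("e", "x")),
    ("Kim ran the race / the store", "run_v_1", ("e", "x", "x'")),
    ("Kim hoped", "hope_v_1", ("e", "x")),
    ("Kim dreamed", "dream_v_1", ("e", "x", "p")),
]


def arities_by_predicate(examples):
    """Map each predicate name to the sorted list of arities it shows."""
    table = defaultdict(set)
    for _sentence, pred, args in examples:
        table[pred].add(len(args))
    return {pred: sorted(arities) for pred, arities in table.items()}


def variable_arity_predicates(examples):
    """Predicates that appear with more than one arity."""
    table = arities_by_predicate(examples)
    return sorted(pred for pred, arities in table.items() if len(arities) > 1)


mixed = variable_arity_predicates(EXAMPLES)
```

On the examples above this flags run_v_1 and understand_v_by as 
appearing with more than one arity, which is the notational convenience 
I mentioned.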

So - I don't think it'll be a big hassle to add them to the DMRS code, 
but I don't propose to add them to the DMRS formal description and I 
don't think it's worthwhile expending energy on trying to clean this 
up.  There are ways to allow argument slot typing without messing up the 
scope machinery, if that's something that needs to be fixed.

All best,

Ann