[matrix] Pronoun Relations

Fri Feb 16 06:14:23 CET 2007

Hi Ann,

Thanks for the quick reply!

Ann Copestake wrote:
> ebender at u.washington.edu said:
> > I think it follows that we don't lose much by going to a
> > no-pronoun-rel strategy in the Matrix while keeping the
> > pronoun_rels for the ERG: we're going to have differences between
> > languages no matter what we do.
>
> I would argue that it means that anyone using the Matrix to develop
> a grammar for a language with pronouns will have to adopt a more
> convoluted anaphora resolution strategy than is necessary.  It will
> also mean that code I incorporate into the LKB for anaphora
> resolution based on our ERG work can't be used for Matrix grammars
> without adaptation.

In order for the anaphora resolution based on the ERG to work for
other languages, you're going to require not only pronoun_rels for
overt pronouns but also pronoun_rels for dropped arguments.  And
pronoun_rels for dropped arguments require quantifiers to bind the
index, which leads to lots of possible scopings...  Unless something
has changed since 12/05, JACY isn't putting in those pronouns.  The
only non-Germanic language I can think of with English-like insistence
on overt arguments is French, and even that is questionable (depending
on what you say about the clitics).
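
As a rough illustration of why this multiplies readings: if each dropped
argument gets its own quantifier and nothing else constrains relative
scope, n quantifiers can be ordered in n! ways.  The sketch below only
counts orderings.  The quantifier names are made up for the example, and
a real scope resolver would rule some orderings out and would also have
to deal with other scopal elements.

```python
from itertools import permutations

def scope_orderings(quantifiers):
    """Enumerate every relative ordering of the given quantifiers.

    Upper-bound illustration only: it ignores any constraints that
    would exclude some orderings in a real MRS.
    """
    return list(permutations(quantifiers))

# Hypothetical clause with two overt NPs and two dropped arguments,
# each dropped argument getting a (made-up) pronoun quantifier.
overt_only = ["every_q", "some_q"]
with_dropped = overt_only + ["pronoun_q_1", "pronoun_q_2"]

print(len(scope_orderings(overt_only)))    # 2 orderings
print(len(scope_orderings(with_dropped)))  # 24 orderings
```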

Ann Copestake wrote:
> I was talking about systematic language differences in general when
> I used the term `overhaul'.  But I think you may underestimate the
> effects that changes in the MRSs have for people who are using the
> grammars.  Whether all nominal indices associated with non-optional
> arguments are associated with relations or not is something that
> potentially matters.  I can't tell you for sure whether it's
> something I've assumed in code or not.  Obviously this is the sort
> of thing that I would be prepared to fix if it's necessary, but, as
> you well know, my fixes don't often happen quickly ...

What is your operational definition of "non-optional argument"?
Is this something that's annotated in the Sem-I?  

Ann Copestake wrote:
> ebender at u.washington.edu said:
> > It seems like taking the cfrom/cto from the selecting predicate
> > should work.  I tried for a bit to come up with some examples in
> > English that would tease the two apart, but without success
> > (largely because it's hard to separate pronouns from their
> > selecting predicates by enough material to stick in an
> > antecedent).
> 
> `the selecting predicate'? - for control cases, there will be multiple
> verb predicates.  I don't want to argue that there is no algorithm
> that will do this, just that it complicates things.

Sure, fair enough, but I don't think that's actually very hard.
In control cases, there's a well-known relationship between the predicates
and so it should be straightforward to determine which predicate to
pin the pronoun on.  Again, unless we're dropping in pronoun_rels for
unexpressed arguments in other languages, we're going to have to
solve this in the general case anyway.
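
A minimal sketch of what picking the predicate could look like, under
strong simplifying assumptions: the `EP` class and `selecting_predicate`
function are hypothetical (not LKB code), the control verb is assumed to
take the embedded predicate's label directly as an argument (no qeq
constraints), and the cfrom/cto values are just character offsets for
"She tried to sleep".

```python
from dataclasses import dataclass, field

@dataclass
class EP:
    """A toy elementary predication (not the LKB's data structure)."""
    pred: str
    label: str
    cfrom: int
    cto: int
    args: dict = field(default_factory=dict)

def selecting_predicate(eps, index):
    """Pick the predicate to pin a relation-less index on."""
    # Candidates take the index as an argument other than their own ARG0.
    cands = [ep for ep in eps
             if any(role != "ARG0" and val == index
                    for role, val in ep.args.items())]
    # In control, the matrix predicate also takes the embedded predicate's
    # label as an argument, so drop candidates embedded under another one.
    embedded = {c.label for c in cands for o in cands
                if o is not c and c.label in o.args.values()}
    top = [c for c in cands if c.label not in embedded]
    return top[0] if top else None

# "She tried to sleep", with no pronoun relation introduced for x1.
eps = [
    EP("_sleep_v_1", "h2", 13, 18, {"ARG0": "e2", "ARG1": "x1"}),
    EP("_try_v_1", "h1", 4, 9, {"ARG0": "e1", "ARG1": "x1", "ARG2": "h2"}),
]
ep = selecting_predicate(eps, "x1")
print(ep.pred, ep.cfrom, ep.cto)  # _try_v_1 4 9
```

Both verbs take x1 as an argument, but only the matrix verb survives the
embedding check, so its cfrom/cto would be the ones borrowed for x1.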

Ann Copestake wrote:
> ebender at u.washington.edu said:
> > Wouldn't it be possible to relate the pronouns to their indices by
> > looking at the actual derivations, and going from pronouns to MRS,
> > rather than the other way around?
>
> I don't follow - you mean keep the feature structures around???  We
> could perhaps change the parsers to allow CFROM/CTO values to be
> placed on indices iff there was a pronoun - but it would be
> time-consuming and messy (especially if there's a treatment of
> binding theory in a grammar).

This is what I get for picking up a cold thread.  That remark was in
the context of something you'd said about evaluation. I thought you meant
that you have to identify the pronouns for comparison against a gold
standard.  Thus I wasn't suggesting using the whole feature structure
in the process of actual reference resolution, just in the construction
of the test data.

Ann Copestake wrote:
> ebender at u.washington.edu said:
> > Is this kind of variation relevant to anaphora resolution though?
> > I think the best representation of politeness is through a
> > separate relation predicated of that individual (perhaps in CTXT
> > or Background, rather than in the CONT).  If we look at reference
> > resolution more generally (rather than just anaphora resolution) I
> > think we'll have other reasons for taking more relations into
> > account (beyond just the noun-relation introducing the index).
>
> I'm just asking if you want to force the position where every type
> of variation in pronouns has to be accounted for on the index or
> elsewhere in the MRS/information structure.  Maybe this is right in
> the long term but I see it as a potential source of problems.

I think that is probably a reasonable position: A working definition
of a pronoun is a word which refers to a discourse element without
predicating anything in particular of it.  Additional information encoded
in pronouns involves grammatical categories (person, number, gender)
which serve to narrow down the possible class of antecedents.
This line of reasoning would also seem to support the idea that
pronouns shouldn't introduce "relations".

Consider a language which has polite pronouns and matching polite
inflection on the verb, and furthermore allows the pronouns to be
dropped.  In that case, if the politeness information is encoded
in the pronoun relation, what do you do when the pronoun is dropped?
I see two possibilities: (1) provide no pronoun relation, and lose
the politeness information; or (2) provide a pronoun relation of some
sort, in which case getting the right pronoun relation would probably
require multiplying the pro-drop rules.
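
For contrast, here is a minimal sketch of the alternative, assuming the
politeness information lives on the index (or somewhere else outside a
pronoun relation) and that the verb's inflection can contribute it.  The
property names `POLITE` and `PERS` and the function `subject_index` are
hypothetical, chosen only for the example.

```python
def subject_index(verb_agreement, pronoun=None):
    """Combine the properties the verb and an optional overt pronoun
    place on the subject index.

    Returns None if the two sources clash (a toy stand-in for
    unification failure).
    """
    props = dict(verb_agreement)
    for key, value in (pronoun or {}).items():
        if props.get(key, value) != value:
            return None
        props[key] = value
    return props

polite_verb = {"POLITE": "+"}                   # hypothetical property name
polite_pronoun = {"PERS": "2", "POLITE": "+"}
plain_pronoun = {"PERS": "2", "POLITE": "-"}

print(subject_index(polite_verb, polite_pronoun))  # overt polite pronoun
print(subject_index(polite_verb))                  # pronoun dropped
print(subject_index(polite_verb, plain_pronoun))   # mismatch
```

In the dropped-pronoun case the index still carries the politeness value
from the verb, with no extra pro-drop rule needed in this toy setup.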

Ann Copestake wrote:
> 2) we have multiple users of MRSs with different needs and we need
> to try and reconcile these.  (I do not want to end up in a situation
> where code that operates on MRSs is usable only with ERG MRSs and
> not Matrix MRSs, but it seems to me that's what is liable to
> happen.)

I agree that reusability of code is at least half of the point of the
Matrix exercise.  At the same time, I think it's important to have
discussions like this one which make room for airing data from
languages other than English in the course of making design decisions
about MRSs which in turn impact the design of code that interprets
MRSs in various ways.  I know that this discussion started off with an
MT-motivated question, but my bigger concern is actually the question
of whether (and when) to hallucinate pronoun_rels for dropped
arguments, especially in languages like Japanese.

Emily