[developers] Roots in ACE output

Tue Aug 19 18:19:32 CEST 2014

Hi Woodley and all - 

It would be worthwhile to consider having grammarians refactor the workload of their set of root symbols to minimize the overlap.  I'll give that a little thought for the ERG, but in the meantime I could also be content with an interpretation of the set of root symbols as an ordered list, as another way to `resolve' the ambiguity where more than one root symbol would admit an analysis.

 Dan

----- Original Message -----
From: "Woodley Packard" <sweaglesw at sweaglesw.org>
To: "Michael Wayne Goodman" <goodmami at u.washington.edu>
Cc: "Delph-in developers list" <developers at delph-in.net>
Sent: Tuesday, August 19, 2014 8:32:42 AM
Subject: Re: [developers] Roots in ACE output

Hi Michael,

This has long been a difference between PET on the one hand and ACE and the LKB on the other hand.  There is (at least currently) no way to ask ACE to include root symbols in the derivations it prints.

I'm not sure I'm prepared to agree with your assessment that this is a problem.  In the past I've resisted adding root symbols to derivations because their difference in formal status compared to the rest of the derivation raises some uncomfortable questions.  For instance, do two read-outs with identical sequences of rules and lexemes but different root conditions constitute different derivations or the same derivation?  Since root conditions cannot add information, the two (or more) derivations would not differ e.g. in MRS.  This situation occurs all the time, since for the ERG, for example, any derivation that matches root_strict also matches root_informal.  To me it feels like intentionally introducing spurious ambiguity.  The way around that is to sacrifice a degree of declarativity, i.e. make the ordering of root symbols important and never acknowledge some of what would otherwise be considered the set of valid derivations.  That's what PET does, I believe.

I can see that for the purposes of training parse disambiguation models, the root symbol(s) that licensed a derivation may be a helpful feature -- although to my knowledge that has not been demonstrated in the literature.  A less vexed solution might be to list *all* the root symbols that match a derivation -- either all together on the top node for the (single) derivation, or recorded as a separate field in the tsdb profile.
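
As a minimal sketch of the two policies discussed above (reporting only the first matching root from an ordered list, versus listing all matching roots), consider the following Python. Everything here is a toy assumption for illustration: the dict-based derivation, the "complete" and "informal" features, and both function names are hypothetical and do not reflect how ACE, PET, or the LKB actually test root conditions. Only the root names root_strict and root_informal come from the thread.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stand-in for a derivation: a bag of boolean properties.
Derivation = Dict[str, bool]
RootTest = Callable[[Derivation], bool]

# Ordered list of (root name, toy compatibility test). The tests are invented;
# they only mimic the property that anything matching root_strict also
# matches root_informal.
ROOTS: List[Tuple[str, RootTest]] = [
    ("root_strict", lambda d: d["complete"] and not d["informal"]),
    ("root_informal", lambda d: d["complete"]),
]


def first_matching_root(d: Derivation,
                        roots: List[Tuple[str, RootTest]] = ROOTS) -> Optional[str]:
    """Ordered-list policy: report only the first root that admits d."""
    for name, test in roots:
        if test(d):
            return name
    return None


def all_matching_roots(d: Derivation,
                       roots: List[Tuple[str, RootTest]] = ROOTS) -> List[str]:
    """Declarative policy: report every root that admits d, in list order."""
    return [name for name, test in roots if test(d)]


formal = {"complete": True, "informal": False}
print(first_matching_root(formal))  # one label per derivation
print(all_matching_roots(formal))   # all labels on a single derivation
```

Under the first policy the result depends on the order of ROOTS; under the second it does not, and a single derivation simply carries several root labels.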

Curious to hear others' thoughts on this (somewhat sore for me) question,
Woodley

On Aug 18, 2014, at 10:22 PM, Michael Wayne Goodman <goodmami at u.washington.edu> wrote:

> Hi Woodley (and Developers on the CC),
> 
> We are noticing that ACE is not giving us the roots in derivation
> trees. This problem came up before for PET and the LKB nearly 6 years
> ago (see http://lists.delph-in.net/archives/developers/2008/001057.html).
> I see this behavior both when I use the `ace` command directly and
> when I batch process with `art`. Is there some way to make sure the
> root nodes show up in the derivations?
> 
> Here is the value of parsing-roots in Jacy's ace/config.tdl:
> 
> ;;;|| parsing-roots || list of root instances enabled for parsing ||
> parsing-roots             := utterance-root.
> 
> And we are not using the -r option of ACE.
> 
> Any suggestions?
> 
> -- 
> -Michael Wayne Goodman