[developers] Defaults in TDL

Guy Emerson gete2 at cam.ac.uk
Sat Sep 22 12:00:57 CEST 2018


On Fri, 21 Sep 2018 at 23:12, goodman.m.w at gmail.com <
goodman.m.w at gmail.com> wrote:

> Returning to this thread so it doesn't fade away...
>
> I don't understand the complexities well enough to have an opinion on
> whether we should use YADU, Guy's variant, or something else.
>
>
> I'm concerned, however, that if it is sufficiently complicated to work out
> the ramifications of the use of defaults, it would outweigh the benefits
> from the perspective of the grammar developer. With this in mind I would
> (...expect that a grammar developer would...) prefer something more
> predictable over something slightly more powerful. I'd even be happy with
> something more restrictive than YADU if it were still useful, easy to
> understand, and could give informative error messages at compile time. That
> is, can we somehow proscribe the complicated corner cases with a more
> restrictive definition?
>

I think it would make sense to have a number of use cases (or unit tests)
to see whether any variant of defaults would help the developer.  I
proposed my variant to deal with the use case Emily gave -- but there could
well be other cases I haven't thought of, as well as side effects I haven't
thought of!  I'm not sure that a more restrictive version of YADU could
deal with Emily's use case.


> Regarding the syntax of these things, Emily and Guy use /#coref, whereas
> Copestake 2002, the ParisDefeasibleConstraints wiki, and the LKB use /l
> (lowercase L for "lexical", which is specified by the
> *description-persistence* parameter in the LKB). I don't really understand
> the difference, but it seems the latter doesn't require a paired
> coreference, as shown in regverb in the standard example below:
>
>     verb := *top* &
>     [ PAST /l #pp,
>       PASTP #p /l #pp,
>       PASSP #p /l #pp ].
>
>     regverb := verb &
>     [ PAST /l "ed" ] .
>
> If these are in fact different instances of the same mechanism, then maybe
> "/#coref" is just "/ #coref" (i.e., a persistent-default operator followed
> by an actual coreference)?
>

The YADU paper (Lascarides and Copestake, 1999) also uses just "/" rather
than "/l" -- although that paper gives typeset AVMs, rather than
machine-readable tdl.  I'd be happy to stick to "/l" if that would reduce
confusion!


>
> On a related note, I recently added to PyDelphin the ability to serialize
> TDL, so I can imagine writing code that would transform a grammar with
> defaults to one without as some kind of compile-time preprocessor. It would
> be great if we could agree on and provide implementations of defaults
> before the next summit.
>

Yes!  So far I haven't been able to find explicit documentation of how
defaults are handled.  My current understanding is that the compile-time
processor needs to expand type and instance definitions to include all
inherited constraints -- and then the (possibly inherited) default
constraints on instances are turned into hard constraints, while the
(possibly inherited) default constraints on types are discarded.  I would
be happy to be corrected here if I'm wrong!
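
To make this concrete, here is a minimal Python sketch of what I imagine
that compile step doing.  The Definition class and the (path, value,
is_default) constraint triples are invented purely for illustration --
none of this is based on actual LKB code, so corrections welcome:

    from dataclasses import dataclass, field

    @dataclass
    class Definition:
        name: str
        parents: list          # names of the immediate supertypes
        constraints: list = field(default_factory=list)  # (path, value, is_default)
        is_instance: bool = False

    def expand(defn, definitions):
        """Collect the definition's own constraints plus everything inherited."""
        collected = []
        for parent in defn.parents:
            collected.extend(expand(definitions[parent], definitions))
        collected.extend(defn.constraints)
        return collected

    def compile_definition(defn, definitions):
        """Defaults on instances become hard; defaults on types are dropped."""
        hard = []
        for path, value, is_default in expand(defn, definitions):
            if not is_default or defn.is_instance:
                hard.append((path, value))
            # else: a (possibly inherited) default on a type -- discarded
        return hard

    # Toy usage, echoing the regverb example:
    definitions = {
        'regverb': Definition('regverb', [], [(('PAST',), '"ed"', True)]),
        'walk': Definition('walk', ['regverb'], [], is_instance=True),
    }
    print(compile_definition(definitions['walk'], definitions))
    # -> [(('PAST',), '"ed"')]   the inherited default is now a hard constraint
    print(compile_definition(definitions['regverb'], definitions))
    # -> []                      the default on the type is simply dropped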


>
> On Thu, Sep 6, 2018 at 4:17 AM Guy Emerson <gete2 at cam.ac.uk> wrote:
>
>> What I've proposed isn't credulous default unification, although it does
>> use it as part of its definition, since DefFS does.  The YADU paper does
>> describe credulous default unification (definition 9, on page 69, or page
>> 15 of the pdf), citing Carpenter.
>>
>> Credulously unifying [ F #1, G #1 ] into [ F.P c, G.P d ] just yields the
>> singleton set without any new constraints, { [ F.P c, G.P d ] }, because
>> the only piece of information in [ F #1, G #1 ] that is not already in [
>> F.P c, G.P d ] is the re-entrancy, but this re-entrancy is incompatible.
>> It's crucial that [ F #1, G #1 ] does not contain the paths F.Q and G.Q, or
>> else these *would* be incorporated in credulous unification -- but this
>> follows Emily's example with valence-min, where the relevant feature paths
>> are similarly not there.  This is why I proposed first expanding [ F #1, G
>> #1 ] by credulously unifying in the opposite direction.
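>>
>> In case it helps to see the "maximal consistent additions" idea spelt
>> out, here is a toy Python sketch of credulous default unification in
>> Carpenter's sense.  It works over flat sets of atomic constraints with
>> a caller-supplied consistency test -- a drastic simplification of typed
>> feature structures, and every name in it is made up for illustration:
>>
>>     from itertools import combinations
>>
>>     def credulous(strict, default, consistent):
>>         """All results of adding maximal consistent subsets of the
>>         `default` atoms to the `strict` atoms."""
>>         results = []
>>         for size in range(len(default), -1, -1):   # largest additions first
>>             for subset in combinations(sorted(default), size):
>>                 candidate = strict | set(subset)
>>                 maximal = not any(candidate <= r for r in results)
>>                 if consistent(candidate) and maximal:
>>                     results.append(candidate)
>>         return results
>>
>>     def consistent(cs):
>>         # In the running example the only clash is the re-entrancy F=G
>>         # forcing the incompatible values c and d to be identical.
>>         return not {'F=G', 'F.P:c', 'G.P:d'} <= cs
>>
>>     print(credulous({'F.P:c', 'G.P:d'}, {'F=G'}, consistent))
>>     # -> [{'F.P:c', 'G.P:d'}] : a singleton set with no new constraints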
>>
>> On Wed, 5 Sep 2018 at 18:05, Ann Copestake <aac10 at cl.cam.ac.uk> wrote:
>>
>>> this seems like it might be credulous default unification?  can't
>>> remember whether we described that in the YADU paper or not - I think the
>>> idea comes from Bob Carpenter.  Sorry I am just procrastinating and don't
>>> have time to read/think properly, or check.
>>>
>>> Cheers,
>>>
>>> Ann
>>>
>>> On 05/09/18 17:56, Guy Emerson wrote:
>>>
>>> I'm going to adapt Alex's extended Nixon diamond example from
>>> http://moin.delph-in.net/StanfordDefaults , because it's a minimal
>>> working example that captures the essence of Emily's example.
>>>
>>> We have:
>>>
>>> a := *top* &
>>>   [ F /#1,
>>>     G /#1 ].
>>>
>>> x := *top* &
>>>   [ P *top*,
>>>     Q *top* ].
>>>
>>> c := *top*.
>>> d := *top*.
>>>
>>> b := a &
>>>   [ F.P c,
>>>     G.P d ].
>>>
>>> I take it that the behaviour we would like to have is that the default
>>> re-entrancy between F and G is pushed down (in b) to a re-entrancy between
>>> F.Q and G.Q.  As I understand, YADU (
>>> http://www.aclweb.org/anthology/J99-1002 ) would discard the
>>> re-entrancy, because it's not compatible with the hard constraints.  (Also,
>>> note that I have deliberately left the features P and Q outside of the
>>> definition of a, to mimic valence-min in Emily's example -- we can't
>>> represent the re-entrancy between F.Q and G.Q in a, but only in b.)
>>>
>>> I have a tentative proposal -- and I say tentative, because Ann has
>>> already warned that YADU can blow up, and what I'm about to suggest would
>>> probably be even more likely to blow up.  So here be explosive dragons!
>>>
>>> But with that warning aside... YADU represents a defeasible feature
>>> structure in two parts - a normal feature structure (containing the hard
>>> constraints), and a "tail" of default constraints.  In the above example,
>>> only a has a non-empty tail, which just has one thing in it (the
>>> re-entrancy).  With YADU, unifying two defeasible structures involves first
>>> unifying their hard structures, combining the tails, and then discarding
>>> anything in the tail that's incompatible with the hard structure.  My
>>> proposed extension of YADU is not to discard incompatible constraints in
>>> the tail, but rather "expand" them and keep whatever compatible constraints
>>> are in this expanded set.
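>>>
>>> Schematically -- using a deliberately crude encoding of a defeasible
>>> structure as a pair (hard, tail), where both sides are just flat sets
>>> of atomic constraints and "unification" is set union -- plain YADU's
>>> step might be sketched in Python like this (nothing here is meant to
>>> correspond to a real implementation):
>>>
>>>     def yadu_unify(dfs1, dfs2, consistent):
>>>         """Unify the hard parts, pool the tails, and discard tail
>>>         elements that no longer fit the unified hard structure."""
>>>         hard1, tail1 = dfs1
>>>         hard2, tail2 = dfs2
>>>         hard = hard1 | hard2        # assume the hard parts do unify
>>>         tail = {c for c in tail1 | tail2 if consistent(hard | {c})}
>>>         return hard, tail
>>>
>>>     def consistent(cs):
>>>         # the only possible clash in the example: F=G forces c = d
>>>         return not ('F=G' in cs and {'F.P:c', 'G.P:d'} <= cs)
>>>
>>>     a = (set(), {'F=G'})                  # default re-entrancy only
>>>     b = ({'F.P:c', 'G.P:d'}, set())       # hard values only
>>>     print(yadu_unify(a, b, consistent))
>>>     # -> ({'F.P:c', 'G.P:d'}, set()) : YADU just drops the re-entrancy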
>>>
>>> This "expansion" of constraints doesn't require new algebraic
>>> operations, because we can use what's already in YADU -- in particular, the
>>> DefFS operation (Definition 12 on page 70, which is page 16 in the pdf).
>>> Essentially, DefFS takes a hard structure and a set of possibly
>>> contradictory default constraints, and adds as many of the default
>>> constraints as possible without creating a contradiction.  My proposal is
>>> to apply DefFS *backwards* -- if we unify two hard structures and find that
>>> some default constraint in the tail is now incompatible with the unified
>>> hard structure, we can use DefFS to "expand" the default constraint.  We
>>> treat the default constraint as the hard structure in DefFS, and we treat
>>> the unified hard structure as the tail in DefFS.  This tells us how much of
>>> the unified hard structure is compatible with the default constraint -- in
>>> other words, we have "expanded" the default constraint in the context of
>>> the unified hard structure.  We can now decompose this expanded structure
>>> into individual constraints, and add them to the new tail, as long as they
>>> don't contradict the new unified hard structure.
>>>
>>> This might be clearest looking at the above example.  We would like to
>>> unify b's feature structure with a's feature structure.  This gives us the
>>> following hard structure, and tail (I am ignoring the types of tail
>>> elements, for simplicity, because we aren't considering when one default
>>> overrides another default):
>>>
>>> [ F x & [ P c,
>>>           Q *top* ],
>>>   G x & [ P d,
>>>           Q *top* ] ].
>>> tail: { F=G }
>>>
>>> To apply DefFS, we first credulously add the hard structure's
>>> constraints to the tail element.  This gives a pair of structures (which
>>> differ in the value of F.P):
>>>
>>> [ F #1 & x & [ P c,
>>>                Q *top* ],
>>>   G #1 ].
>>>
>>> [ F #1 & x & [ P d,
>>>                Q *top* ],
>>>   G #1 ].
>>>
>>> DefFS gives us the generalisation of these two structures, which is:
>>>
>>> [ F #1 & x & [ P *top*,
>>>                Q *top* ],
>>>   G #1 ].
>>>
>>> We can break this up into individual constraints: { F:x, G:x, F=G,
>>> F.P=G.P, F.Q=G.Q }
>>>
>>> Of these five constraints, the first two are already part of the unified
>>> hard structure, and the second two are incompatible with it.  So we are
>>> left with a single element to keep in the tail: F.Q=G.Q.  So, expanding
>>> elements in the tail rather than discarding them gives:
>>>
>>> [ F x & [ P c,
>>>           Q *top* ],
>>>   G x & [ P d,
>>>           Q *top* ] ].
>>> tail: { F.Q=G.Q }
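>>>
>>> In the same crude set-of-atomic-constraints encoding as above, the
>>> whole expand-rather-than-discard step can be sketched like this -- the
>>> closure and consistency tests are hard-wired for just this example,
>>> so again this is only an illustration, not a proposed implementation:
>>>
>>>     from itertools import combinations
>>>
>>>     def credulous(strict, default, consistent):
>>>         """All maximal consistent ways of adding `default` atoms to `strict`."""
>>>         results = []
>>>         for size in range(len(default), -1, -1):
>>>             for subset in combinations(sorted(default), size):
>>>                 candidate = strict | set(subset)
>>>                 maximal = not any(candidate <= r for r in results)
>>>                 if consistent(candidate) and maximal:
>>>                     results.append(candidate)
>>>         return results
>>>
>>>     def def_fs(strict, tail, consistent):
>>>         """DefFS, modelled as the generalisation (here: intersection)
>>>         of all the credulous results."""
>>>         results = credulous(strict, tail, consistent)
>>>         out = results[0]
>>>         for r in results[1:]:
>>>             out = out & r
>>>         return out
>>>
>>>     def expand_tail_element(element, hard, closure, consistent):
>>>         """DefFS 'backwards': the tail element plays the hard structure,
>>>         the unified hard structure plays the tail; then keep whatever
>>>         atoms of the expansion are still compatible with `hard`."""
>>>         expanded = def_fs(closure({element}), hard, consistent)
>>>         return {atom for atom in closure(expanded) - hard
>>>                 if consistent(hard | {atom})}
>>>
>>>     hard = {'F:x', 'G:x', 'F.P:c', 'G.P:d', 'F.Q:*top*', 'G.Q:*top*'}
>>>
>>>     def closure(cs):
>>>         # F=G entails re-entrancies on both of x's features
>>>         return cs | ({'F.P=G.P', 'F.Q=G.Q'} if 'F=G' in cs else set())
>>>
>>>     def consistent(cs):
>>>         clash = ('F=G' in cs or 'F.P=G.P' in cs) and {'F.P:c', 'G.P:d'} <= cs
>>>         return not clash
>>>
>>>     print(expand_tail_element('F=G', hard, closure, consistent))
>>>     # -> {'F.Q=G.Q'} : exactly the element we want to keep in the tail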
>>>
>>> Finally, the last step (as with YADU) is to apply DefFS, which would
>>> give the following compiled structure used at runtime (supposing I've
>>> understood what is supposed to happen at compile time -- I wasn't sure
>>> where to look for this documentation):
>>>
>>> [ F x & [ P c,
>>>           Q #1 ],
>>>   G x & [ P d,
>>>           Q #1 ] ].
>>>
>>> On Tue, 4 Sep 2018 at 23:10, Emily M. Bender <ebender at uw.edu> wrote:
>>>
>>>> Dear Mike,
>>>>
>>>> Thanks for bringing up this issue.  At the 2010 DELPH-IN Paris Summit,
>>>> Ann and I had a further conversation about this, from which I took the
>>>> homework of typing up what it is I'd like to have (as a grammar developer,
>>>> and especially from the point of view of the Matrix) with respect to
>>>> defeasible constraints.  Here's what I wrote down later that year (Oct 27):
>>>>
>>>> Dear Ann,
>>>>
>>>> Here, with much more delay than I intended, is the write up
>>>> I promised of my (reconstruction of my) understanding of where
>>>> we ended up in our discussion of defeasible identity constraints
>>>> over crepes in Paris.
>>>>
>>>> First, why I want it:
>>>>
>>>> In lexical rules, we want to be able to say (like in SWB) that
>>>> the value of certain features (HOOK, CAT, ARG-ST) is shared
>>>> between the mother and the daughter unless the rule contradicts
>>>> this.  If the rule does contradict it, then we want only the information
>>>> specifically stated as such to change, and the rest "around" it,
>>>> to be shared.
>>>>
>>>> For a concrete example, take a hypothetical lexical rule that
>>>> changes the case on the first complement from acc to dat.
>>>>
>>>> First, here's the general lex rule type:
>>>>
>>>> lex-rule := phrase-or-lexrule & word-or-lexrule &
>>>>   [ NEEDS-AFFIX bool,
>>>>     SYNSEM.LOCAL.CONT [ RELS [ LIST #first,
>>>>                              LAST #last ],
>>>>                         HCONS [ LIST #hfirst,
>>>>                                 LAST #hlast ] ],
>>>>     DTR #dtr & word-or-lexrule &
>>>>         [ SYNSEM.LOCAL.CONT [ RELS [ LIST #first,
>>>>                                      LAST #middle ],
>>>>                               HCONS [ LIST #hfirst,
>>>>                                       LAST #hmiddle ] ],
>>>>           ALTS #alts ],
>>>>     C-CONT [ RELS [ LIST #middle,
>>>>                     LAST #last ],
>>>>              HCONS [ LIST #hmiddle,
>>>>                      LAST #hlast ]],
>>>>     ALTS #alts,
>>>>     ARGS < #dtr > ].
>>>>
>>>> And a subtype with the defeasible identity indicated
>>>> (using /# for now):
>>>>
>>>> defeasible-identity-lex-rule := lex-rule &
>>>>   [ SYNSEM.LOCAL.CAT /#cat,
>>>>     ARG-ST /#arg-st,
>>>>     C-CONT.HOOK /#hook,
>>>>     DTR [ SYNSEM.LOCAL [ CAT /#cat,
>>>>                          CONT.HOOK /#hook ],
>>>>           ARG-ST /#arg-st ]].
>>>>
>>>> The lex rule definition itself would just look like this:
>>>>
>>>> acc-to-dat-obj-lex-rule := defeasible-identity-lex-rule &
>>>>  [ SYNSEM.LOCAL.CAT.VAL.COMPS.FIRST.LOCAL.CAT.HEAD.CASE dat,
>>>>    DTR.SYNSEM.LOCAL.CAT.VAL.COMPS.FIRST.LOCAL.CAT.HEAD.CASE acc ].
>>>>
>>>> The intended behavior is for that to compile into a rule that
>>>> includes constraints like these (I'm sure I'm missing some here):
>>>>
>>>> acc-to-dat-obj-lex-rule (expanded):
>>>>  [ SYNSEM.LOCAL.CAT [ HEAD #head,
>>>>                       VAL [ SPR #spr,
>>>>                             SPEC #spec,
>>>>                             SUBJ #subj,
>>>>                             COMPS [ REST #rest,
>>>>                                     FIRST [ NON-LOCAL #non-local,
>>>>                                             LOCAL [ CONT #cont,
>>>>                                                     CAT [ VAL #val,
>>>>                                                           AGR #agr,
>>>>                                                           HEAD.CASE dat ]]]]]],
>>>>    C-CONT.HOOK #hook,
>>>>    ARG-ST #arg-st,
>>>>    DTR [ SYNSEM.LOCAL [ CONT.HOOK #hook,
>>>>                         CAT [ HEAD #head,
>>>>                               VAL [ SPR #spr,
>>>>                                     SPEC #spec,
>>>>                                     SUBJ #subj,
>>>>                                     COMPS [ REST #rest,
>>>>                                             FIRST [ NON-LOCAL #non-local,
>>>>                                                     LOCAL [ CONT #cont,
>>>>                                                             CAT [ VAL #val,
>>>>                                                                   AGR #agr,
>>>>                                                                   HEAD.CASE acc ]]]]]]],
>>>>          ARG-ST #arg-st ]].
>>>>
>>>>
>>>> What I remember from Paris is that we decided it would be best to
>>>> encode these constraints not directly in the type definition as
>>>> I did above in defeasible-identity-lex-rule but in a collateral file
>>>> that
>>>> instructs the LKB to do something special with certain feature paths
>>>> on instances of certain types at compile time.
>>>>
>>>> We also worked out that we would only be able to "push down" the
>>>> identity constraint to features that were necessitated by the types
>>>> invoked in the rule.  Thus in the example above, we know that SPR,
>>>> SPEC and SUBJ need to be identified because the value of VAL is
>>>> necessarily "valence" (and not valence-min) as we've mentioned COMPS.
>>>> But if CASE were appropriate for both noun and comp (for example),
>>>> then we wouldn't be able to know to put in identity constraints for
>>>> any other features of noun (or comp).  If the daughter in fact had a
>>>> constraint on one of these other features, it wouldn't be copied up
>>>> to the mother.  Relatedly, we lose the actual HEAD value because we
>>>> can't identify HEAD while changing CASE.  (So here, the grammar
>>>> writer would need to stipulate [HEAD noun], say, on the mother.)
>>>>
>>>> In Paris, I remember being convinced that the added simplicity in
>>>> defining lexical rules would outweigh the lack of transparency noted
>>>> above.  And I'm still pretty sure I agree with that.  One thing in
>>>> favor of that view is that if a rule defined using the defeasible
>>>> identity type didn't have the expected behavior, the grammar engineer
>>>> could always either add constraints or side-step that type and
>>>> hand-specify all the desired identities.
>>>>
>>>> A further complication I noticed while writing out this example is
>>>> the interaction between defeasible and indefeasible identity tags.
>>>> Two conditions to consider:
>>>>
>>>> 1) The rule inherits a constraint (e.g., from the type of the DTR
>>>> value) that the REST of the ARG-ST is the same as the COMPS list.
>>>> 2) The rule doesn't inherit such a constraint, but the constituent
>>>> that serves as the daughter identifies its ARG-ST.REST and its COMPS.
>>>>
>>>> I think (2) isn't a problem (this is very similar to things that
>>>> confused Tom, Ivan and me as we designed the lex rules in the
>>>> textbook, though, so I'm not feeling very confident just now!).  As
>>>> for (1), it could entail a similar push down of identity inside the
>>>> ARG-ST.  But what if the ARG-ST to DTR.ARG-ST identification were a
>>>> non-defeasible identity constraint?  Maybe that's just a broken
>>>> grammar that either shouldn't compile or would just have surprising
>>>> behavior.
>>>>
>>>> I hope you are still interested in this problem.  Let me know if/when
>>>> it would be useful to have a grammar to play with.
>>>>
>>>> Thanks!
>>>>
>>>> On Tue, Sep 4, 2018 at 10:42 AM, goodman.m.w at gmail.com <
>>>> goodman.m.w at gmail.com> wrote:
>>>>
>>>>> Hello everyone,
>>>>>
>>>>> I appreciate the feedback I've received in previous messages in my
>>>>> attempts to dust off neglected corners of TDL syntax, and I'd now like to
>>>>> bring up "defaults", or "defeasible constraints" (I believe these refer to
>>>>> the same thing). Are we prepared to start supporting
>>>>> defaults/defeasible-constraints in our processors and using them in our
>>>>> grammars? Or should we discard them as an undesired experimental feature
>>>>> (i.e., declare them to *not* be part of DELPH-IN TDL)?
>>>>>
>>>>> Further information:
>>>>>
>>>>> Currently, only the LKB supports them (and maybe PET?). As I
>>>>> understand, they are a compile-time feature, meaning that they change how
>>>>> the grammar is compiled and that there is no longer a notion of "defaults"
>>>>> during run-time. I don't think the use of defaults causes any change in the
>>>>> competence or performance of a grammar.
>>>>>
>>>>> The benefit of defaults is for the grammar engineer, as they can
>>>>> reduce the amount of boilerplate code and make the grammar source
>>>>> code more intuitive. I think any result that makes grammar writing
>>>>> easier is a big win. The differences they create between the
>>>>> source-code form of the grammar and the compiled hierarchy, however,
>>>>> can complicate debugging (e.g., interactive unification).
>>>>>
>>>>> Some links:
>>>>>   - http://www.aclweb.org/anthology/J99-1002
>>>>>   - http://moin.delph-in.net/ParisDefeasibleConstraints
>>>>>   - http://moin.delph-in.net/StanfordDefaults
>>>>>
>>>>> --
>>>>> -Michael Wayne Goodman
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Emily M. Bender
>>>> Professor, Department of Linguistics
>>>> University of Washington
>>>> Twitter: @emilymbender
>>>>
>>>
>>>
>
> --
> -Michael Wayne Goodman
>