[developers] Questions on the syntax of TDL
Michael Wayne Goodman
goodmami at uw.edu
Fri Jul 13 02:46:31 CEST 2018
Thanks for your input, Stephan,
On Thu, Jul 12, 2018 at 2:31 PM, Stephan Oepen <oe at ifi.uio.no> wrote:
> dear all,
>
> i dimly recall there have been multiple previous discussions of
> documentation strings, but so far i could only turn up one such thread
> (which appears somewhat inconclusive, but provides some useful
> reminders also related to other questions originally raised by mike);
>
> http://lists.delph-in.net/archives/developers/2007/000868.html
>
>
Thanks for linking this. This is the thread I referenced in my first
message, but I should have linked it then and saved you the trouble of
searching. I've added it to my list of discussions at the bottom of the
TdlRfc wiki.
> personally, i would think that there is complete freedom in how to
> order the various types of statements in TDL conjunctions, including
> repetition, e.g.
>
> foo := [ FOO + ] & bar & [ BAR - ] & baz.
>
This is a different takeaway than I had from Ann's 2004 email on that
thread, where she argued that supertypes have special status on types,
unlike entries/instances, where they are not in fact supertypes (as Guy
brought up in an earlier post). From p106 of Copestake 2002, discussing an
example `a := b & c.`:
> Elsewhere in the TDL description language, `&` can be taken as equivalent
> to unification, but here it is understood as defining part of the type
> hierarchy. It follows from this type definition that the constraint on `a`
> will be the unification of the constraints on type `b` and type `c`, but
> [...] we need to know what the type hierarchy is before we can talk about
> unification, so the statement about the hierarchy is logically prior.
Thus, allowing supertypes and constraints to mix freely obfuscates the
distinct role in hierarchy creation that the list of supertypes has. That
is, it's a human-comprehensibility issue, not a technical one. But maybe
her perspective has shifted in the last 14 years?
In any case, I think the current implementations allow for your definition
of `foo`, and it's only convention that the supertypes appear first.
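To illustrate what "only convention" means in practice, here is a rough Python sketch. It is not taken from the LKB, ACE, or PET, and the names `split_conjunction` and `partition_terms` are made up for this example. It splits a top-level conjunction on `&` and sorts the terms into supertypes and feature structures, whatever order they appear in. It assumes no escaped quotes inside strings, and it treats any term that does not start with `[`, `<`, `#`, or `"` as a type name.

```python
def split_conjunction(body):
    """Split a TDL conjunction on '&' found outside brackets and strings.

    Assumption for this sketch: strings contain no escaped quotes.
    """
    terms, current = [], []
    depth = 0          # nesting level of [ ... ] and < ... > (also <! ... !>)
    in_string = False
    for ch in body:
        if in_string:
            current.append(ch)
            if ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            current.append(ch)
        elif ch in "[<":
            depth += 1
            current.append(ch)
        elif ch in "]>":
            depth -= 1
            current.append(ch)
        elif ch == "&" and depth == 0:
            terms.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    terms.append("".join(current).strip())
    return terms


def partition_terms(body):
    """Return (supertypes, constraints) regardless of their order in body."""
    terms = split_conjunction(body)
    # Hypothetical classification rule: bare identifiers are type names.
    supertypes = [t for t in terms if not t.startswith(("[", "<", "#", '"'))]
    constraints = [t for t in terms if t.startswith(("[", "<"))]
    return supertypes, constraints


supers, constraints = partition_terms("[ FOO + ] & bar & [ BAR - ] & baz")
print(supers)       # ['bar', 'baz']
print(constraints)  # ['[ FOO + ]', '[ BAR - ]']
```

With a partition like this, a processor can build the hierarchy from the supertypes first and unify the constraints afterwards, so the textual order does not matter to the machine even if it matters to human readers.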
> ann in that historic thread points out that it can be sensible to have
> instance definitions without explicit mention of the type; and i
> imagine the same could in principle also apply to type definitions
> (given strict appropriateness and type inference).
>
Hmm.. I'll pretend I didn't read this. :)
> so i am tempted to assume there is nothing special about the position
> of ‘parent’ types.
>
> i can see how the triple-double-quote syntax (""" ... """) would be
> position-independent, but it would seem to make lexical analysis more
> complex: currently, i believe, all TDL operators are single
> characters.
>
Let's not forget :=, :+, <!, !>, ..., #|, |#, letter-set, wild-card,
prefix, and suffix (ok, they aren't all "operators", but they are all TDL
literals that need to be parsed).
Also it now occurs to me that the triple-double-quote option is essentially
Ann's (vi): "invent yet another reserved character", which, to her, was an
acceptable option.
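To make the lexing point concrete, here is a minimal Python tokenizer sketch. It is an illustration only, not how any existing TDL processor works, and the token names and `tokenize` function are invented here. It covers `"""` docstrings, ordinary strings, both comment styles, and the multi-character literals `:=`, `:+`, `:<`, `<!`, `!>`, and `...`; letter-sets, wild-cards, and affix rules are left out. Listing the docstring pattern before the plain string pattern is all it takes to keep the two apart.

```python
import re

# Order matters: longer or more specific patterns come first.
TOKEN_SPEC = [
    ("DOCSTRING", r'"""[\s\S]*?"""'),
    ("STRING", r'"(?:[^"\\]|\\.)*"'),
    ("BLOCK_COMMENT", r'#\|[\s\S]*?\|#'),
    ("LINE_COMMENT", r';[^\n]*'),
    ("OP", r':=|:\+|:<|<!|!>|\.\.\.|[&.,\[\]<>]'),
    ("COREF", r'#[\w-]+'),
    # Assumption: an identifier is any run of characters not used above.
    ("IDENT", r'[^\s&.,\[\]<>"#;:!]+'),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (kind, value) pairs, discarding comments."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup not in ("BLOCK_COMMENT", "LINE_COMMENT"):
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


for kind, value in tokenize('foo := bar & """a silly example""" [ FOO + ].'):
    print(kind, value)
```

Since the comment tokens are simply dropped in the loop, comments can appear between any two tokens at no extra cost to the parser.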
> i am wondering: could a simple string ever be part of the top-level
> conjunction in a type or instance definition? if so, how? if not,
> what would speak against treating all top-level TDL strings as
> documentation associated with the type or instance definition,
> independent of their position relative to other elements of a
> top-level conjunction, e.g. any of the following variants
>
> foo :=
> "a silly example" &
> bar & [ FOO + ].
>
> foo := bar &
> "a silly example" &
> [ FOO + ].
>
> foo := bar &
> [ FOO + ] &
> "a silly example".
>
> for full generality, one could then either allow a list of
> documentation strings associated with each type or instance definition
> (and have the addendum operator add to the tail of the list), or
> simply concatenate all such strings into one (presumably adding at
> least one newline between each pair of strings).
>
Isn't this Ann's option (iii), which was rejected straight away? String
literals, I think, are implicitly subtypes of *string-type*, and even if
they never unify with, say, *avm*, I don't think that means it's uncharted
territory yet to be defined, but that they should, simply, always fail to
unify. Otherwise people may be confused why this won't work:
foo := bar &
[ FOO "inner-docstring" & + ].
(note: I'm not proposing we start allowing docstrings inside structures)
However, I can imagine defining """...""" as a syntax for something like
[ DOC ... ] (just as < ... > stands for [ FIRST ..., REST *list* ]), which
could then be unified with the rest of the definition using the & operator,
if we don't mind documentation being included in the feature structures
(will this take a toll on memory usage during parsing?).
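A sketch of that rewriting step, purely hypothetical: no processor does this today, and `desugar_docstrings` is a name invented for this example. It replaces each """...""" with a [ DOC "..." ] term before the normal parser sees the text. The feature name DOC is the one floated above, and the escaping rules for the resulting string are an assumption.

```python
import re

DOCSTRING_RE = re.compile(r'"""([\s\S]*?)"""')


def desugar_docstrings(tdl):
    """Rewrite triple-quoted docstrings as [ DOC "..." ] feature terms."""
    def to_feature_term(match):
        # Assumption: backslash-escape quotes so the result is a valid string.
        text = match.group(1).replace("\\", "\\\\").replace('"', '\\"')
        return f'[ DOC "{text}" ]'
    return DOCSTRING_RE.sub(to_feature_term, tdl)


print(desugar_docstrings('foo := bar & """a silly example""" & [ FOO + ].'))
# foo := bar & [ DOC "a silly example" ] & [ FOO + ].
```

Ordinary strings are left untouched, so a quoted value inside a feature structure would still fail to unify with an AVM as before.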
> —mike, many thanks for your work towards an up-to-date and
> consolidated definition of TDL syntax!
>
> oe
>
>
> On Thu, Jul 12, 2018 at 7:15 AM, Michael Wayne Goodman <goodmami at uw.edu>
> wrote:
> > On Wed, Jul 11, 2018 at 7:59 PM, Francis Bond <bond at ieee.org> wrote:
> >>
> >> On my phone so forgive the brevity, but in Paris we agreed that the
> >> docstring will start and end with three ". I think this is what pet now
> >> supports, Dan has a patch for the lkb, and I suspect I was meant to ask
> >> Woodley to add it to ACE.
> >
> >
> > Sorry I missed out on all the fun this year :(
> >
> > I see what you describe mentioned in the LTDB presentation:
> > http://users.sussex.ac.uk/~johnca/summit-2018/ltdb-update.pdf
> >
> > It seems the changes to PET are not yet merged in. Also I think Glenn
> > would also like to know about the agreement.
> >
> >>
> >> Comments anywhere would be great.
> >>
> >> Thanks for pushing this forward Michael.
> >>
> >> On Thu, 12 Jul 2018, 12:34 Michael Wayne Goodman, <goodmami at uw.edu>
> >> wrote:
> >>>
> >>>
> >>> On Wed, Jul 11, 2018, 19:24 Woodley Packard <sweaglesw at sweaglesw.org>
> >>> wrote:
> >>>>
> >>>> 2 cents worth...
> >>>>
> >>>> 1. I think the logical behavior for an addendum with a doc string is
> >>>> concatenation, not replacement. The doc string on the addendum should
> >>>> document what the addendum adds to the type, not the whole type.
> >>>
> >>>
> >>> Hmm, good point. I guess the addendum can only add constraints, so the
> >>> old docstring wouldn't necessarily become invalid. I was comparing to
> >>> method overrides in Python classes, but that's not a useful comparison.
> >>>
> >>>>
> >>>> 2. ACE (I believe) allows comments just about anywhere in TDL. I find
> >>>> this very useful when editing TDL, e.g. annotating changes on a
> >>>> fine-grained level or disabling certain constraints temporarily without
> >>>> deleting them.
> >>>
> >>>
> >>> Yes, it's definitely more useful to allow comments almost anywhere, and
> >>> not really hard to parse, either.
> >>>
> >>>> Regards,
> >>>> Woodley
> >>>>
> >>>> On Jul 11, 2018, at 5:00 PM, Michael Wayne Goodman <goodmami at uw.edu>
> >>>> wrote:
> >>>>
> >>>> Thank you, Bernd, for the feedback. But I'm not having success parsing
> >>>> types with docstrings using PET. E.g., I changed sign_min in the ERG
> >>>> like this:
> >>>>
> >>>> sign_min := *avm* &
> >>>> "doc"
> >>>> [ SYNSEM synsem_min,
> >>>> KEY-ARG bool ].
> >>>>
> >>>> But flop doesn't like it:
> >>>>
> >>>> goodmami at tpy:~/grammars/erg$ flop english.tdl
> >>>> reading `Version.lsp'...
> >>>> converting `english.tdl' (ERG (1214)) into `english.grm' ...
> >>>> loading `english.tdl'... including `fundamentals.tdl'...
> >>>> fundamentals.tdl:21:3: error: (syntax) - got ` [',
> >>>> expecting `.' at end of type definition
> >>>> [...]
> >>>>
> >>>> I get similar errors no matter where I put it (before :=, directly
> >>>> after :=, after ]). It's syntactically valid if I have
> >>>> ... *avm* & "doc" & ..., but then it has trouble unifying (as expected).
> >>>>
> >>>> It does, however, seem to be happy having a comment there (both ; and
> >>>> #| styles) instead of a doc string.
> >>>>
> >>>> I'm using flop version 0.99.14svn_cm from the LOGON distribution.
> >>>>
> >>>> On Wed, Jul 11, 2018 at 7:00 AM, Bernd Kiefer <Bernd.Kiefer at dfki.de>
> >>>> wrote:
> >>>>>
> >>>>> Concerning question 3, at least in TDL and PET there was no such
> >>>>> restriction,
> >>>>> but that could make the definition of docstrings easier.
> >>>>>
> >>>>> Best,
> >>>>>
> >>>>> Bernd
> >>>>>
> >>>>> On 11.07.2018 03:01, Michael Wayne Goodman wrote:
> >>>>>
> >>>>> I attempted to define a BNF-like description of TDL syntax on the wiki:
> >>>>> http://moin.delph-in.net/TdlRfc
> >>>>> I tried to follow the partial BNF in the LKB source and often referred
> >>>>> to the lisp code itself in order to fill out the rest of the
> >>>>> description.
> >>>>>
> >>>>> My 3 questions above are concisely repeated at the bottom of the wiki
> >>>>> along with some others.
> >>>>>
> >>>>> I welcome corrections and discussion (here or on the wiki) from any
> >>>>> TDL nerds or authorities (especially if you've written a TDL parser).
> >>>>>
> >>>>> On Mon, Jul 9, 2018 at 12:49 PM, Michael Wayne Goodman
> >>>>> <goodmami at uw.edu> wrote:
> >>>>>>
> >>>>>> Hi developers,
> >>>>>>
> >>>>>> I'm taking a closer look at the syntax of TDL files and the
> >>>>>> situation is a bit of a mess. Can anyone help me clarify some things?
> >>>>>> (I'll restrict myself to 3 questions for now)
> >>>>>>
> >>>>>> The Copestake 2002 reference (Implementing TFS Grammars) has a BNF
> >>>>>> for TDL, but it's a bit out of date and, according to comments in the
> >>>>>> LKB source code, incorrect in parts. The LKB source comments are
> >>>>>> scattered, incomplete, inconsistent, and also a bit outdated. There is
> >>>>>> not much on the wiki. There is some discussion in the mailing list
> >>>>>> archives (much from before my time in DELPH-IN), but it's not clear
> >>>>>> how current those descriptions are.
> >>>>>>
> >>>>>> Q1: Are supertypes special in a definition?
> >>>>>>
> >>>>>> The BNF (in the LKB source) says this:
> >>>>>>
> >>>>>> Type-def -> Type { Avm-def | Subtype-def} . |
> >>>>>> Type { Avm-def | Subtype-def}.
> >>>>>> Avm-def -> := Conjunction | Comment Conjunction
> >>>>>> Conjunction -> Term { & Term } *
> >>>>>> Term -> Type | Feature-term | Diff-list | List | Coreference
> >>>>>>
> >>>>>> That makes it sound like I could do this:
> >>>>>>
> >>>>>> mytype := [ FEAT val ] & supertype.
> >>>>>>
> >>>>>> or even:
> >>>>>>
> >>>>>> mytype := <! diff list.. !> & #coref & supertype.
> >>>>>>
> >>>>>> But elsewhere it seems like a list of parents is special and appears
> >>>>>> before the rest of the conjunction. E.g., at read-tdl-avm-def of
> >>>>>> lingo/lkb/src/io-tdl/tdltypeinput.lsp I see this alternate
> >>>>>> definition of Avm-def:
> >>>>>>
> >>>>>> ;;; Avm-def -> := Parents Conjunction | Parents Comment Conjunction |
> >>>>>> ;;;            Parents | Parents Comment
> >>>>>>
> >>>>>> It seems that both ACE and PET are fine with putting supertypes
> >>>>>> after the feature list (and some other variations). I'm fine with
> >>>>>> this, but I wonder what it means for docstrings (see Q3 below), which
> >>>>>> (I think) are supposed to appear after the list of parents and before
> >>>>>> the feature list.
> >>>>>>
> >>>>>>
> >>>>>> Q2: Subtype-def is now just a variant of Avm-def, yes?
> >>>>>>
> >>>>>> The BNF still describes subtyping (with the :< operator) as only
> >>>>>> taking a single parent:
> >>>>>>
> >>>>>> Subtype-def -> :< type
> >>>>>>
> >>>>>> But I believe the consensus is that this is unnecessary (it's
> >>>>>> equivalent to using := with only a supertype), so :< is treated as
> >>>>>> equivalent to := (to avoid breaking backward compatibility). Is this
> >>>>>> interpretation used by all processors?
> >>>>>>
> >>>>>>
> >>>>>> Q3: What's the final word with type comments / docstrings?
> >>>>>>
> >>>>>> I find evidence of 3 proposed variants: (1) a block of ";" comments
> >>>>>> before a typename (LTDB-style); (2) a block of ";" comments within a
> >>>>>> type description; and (3) a "doc string" within a type description.
> >>>>>> Furthermore, there is a question as to whether comments or strings
> >>>>>> within a type go after the ":=" or after the list of supertypes. I
> >>>>>> think #| ... |# comments were not considered for this purpose.
> >>>>>>
> >>>>>> My guess is this:
> >>>>>>
> >>>>>> * LTDB-style comments (before the type identifier) are processed
> >>>>>> separately from TDL-parsing
> >>>>>> * type-internal comments can go anywhere but are discarded
> >>>>>> * type-internal doc strings must appear after the list of supertypes
> >>>>>> and are later available for inspection (they are included as a
> >>>>>> non-functional part of a type)
> >>>>>>
> >>>>>> ACE seems happy with my assumptions, although PET doesn't seem to
> >>>>>> like doc strings at all.
> >>>>>>
> >>>>>>
> >>>>>> Thanks!
> >>>>>>
> >>>>>> --
> >>>>>> Michael Wayne Goodman
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Michael Wayne Goodman
> >>>>>
> >>>>>
> >>>>> --
> >>>>> ----------------------------------------------------------------------
> >>>>> Bernd Kiefer DFKI GmbH, Stuhlsatzenhausweg, D-66123 Saarbruecken
> >>>>> kiefer at dfki.de +49-681/85775-5301 (phone) +49-681/85775-5338 (fax)
> >>>>> ----------------------------------------------------------------------
> >>>>> Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
> >>>>> Trippstadter Strasse 122, D-67663 Kaiserslautern, Germany
> >>>>> Geschaeftsfuehrung: Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vor-
> >>>>> sitzender), Dr. Walter Olthoff
> >>>>> Vorsitzender des Aufsichtsrats: Prof. Dr. h.c. Hans A. Aukes
> >>>>> Amtsgericht Kaiserslautern, HRB 2313
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Michael Wayne Goodman
> >
> >
> >
> >
> > --
> > Michael Wayne Goodman
>
--
Michael Wayne Goodman