[developers] Questions on the syntax of TDL

Francis Bond bond at ieee.org
Fri Jul 13 02:06:33 CEST 2018


G'day,

Part of the motivation for the """ syntax was to allow double quotes
in the doc strings. The current implementation (at least in the lkb)
does not allow them, even when escaped as \".   I think none of the
developers have expressed any issues with having a multi-character TDL
operator, so perhaps in practice it is not a big issue?

Luis and I have been rebuilding the Linguistic Type Data-Base, and
would very much like to use the doc-strings. We could live with a
single " (and support for \") or with a list of strings (again with
support for \") but would mildly prefer """ for aesthetic reasons.   I
think Bec has a patch for """ for PET, and Dan for the lkb (although
only for types and not instances if I recall correctly).   We have
been discussing this for over 10 years, so it would be really nice to
actually have something available this year :-).
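
For illustration only, here is a minimal Python sketch of how a reader
could accept both variants mentioned above: a """ ... """ doc string
that may contain bare double quotes, and a single-quoted string with
support for \". The names TRIPLE_QUOTED, SINGLE_QUOTED and
read_docstring are hypothetical and are not taken from the LKB, PET or
ACE sources; how those processors actually tokenize strings is not
shown here.

```python
import re

# Hypothetical token patterns (assumptions for illustration only).
# A doc string delimited by three double quotes; may span lines and
# may contain bare " characters.
TRIPLE_QUOTED = re.compile(r'"""(.*?)"""', re.DOTALL)
# A single-quoted string in which backslash escapes such as \" are allowed.
SINGLE_QUOTED = re.compile(r'"((?:[^"\\]|\\.)*)"', re.DOTALL)


def read_docstring(text, pos=0):
    """Return (docstring, end_position) for a string at pos, or None."""
    # Try the three-character delimiter first, so that an opening
    # triple quote is not read as an empty string plus a stray quote.
    m = TRIPLE_QUOTED.match(text, pos)
    if m:
        return m.group(1), m.end()
    m = SINGLE_QUOTED.match(text, pos)
    if m:
        # Unescape \" and \\ only; other escapes are left untouched.
        return re.sub(r'\\(["\\])', r'\1', m.group(1)), m.end()
    return None


if __name__ == "__main__":
    print(read_docstring('"""a "silly" example""" & bar.'))
    print(read_docstring(r'"a \"silly\" example" & bar.'))
```

In this sketch the only extra lexing work for the multi-character
delimiter is checking it before the single-quote case.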



On Fri, Jul 13, 2018 at 5:31 AM, Stephan Oepen <oe at ifi.uio.no> wrote:
> dear all,
>
> i dimly recall there have been multiple previous discussions of
> documentation strings, but so far i could only turn up one such thread
> (which appears somewhat inconclusive, but provides some useful
> reminders also related to other questions originally raised by mike);
>
> http://lists.delph-in.net/archives/developers/2007/000868.html
>
> personally, i would think that there is complete freedom in how to
> order the various types of statements in TDL conjunctions, including
> repetition, e.g.
>
> foo := [ FOO + ] & bar & [ BAR - ] & baz.
>
> ann in that historic thread points out that it can be sensible to have
> instance definitions without explicit mention of the type; and i
> imagine the same could in principle also apply to type definitions
> (given strict appropriateness and type inference).
>
> so i am tempted to assume there is nothing special about the position
> of ‘parent’ types.
>
> i can see how the triple-double-quote syntax (""" ... """) would be
> position-independent, but it would seem to make lexical analysis more
> complex: currently, i believe, all TDL operators are single
> characters.
>
> i am wondering: could a simple string ever be part of the top-level
> conjunction in a type or instance definition?  if so, how?  if not,
> what would speak against treating all top-level TDL strings as
> documentation associated with the type or instance definition,
> independent of their position relative to other elements of a
> top-level conjunction, e.g. any of the following variants
>
> foo :=
> "a silly example" &
> bar & [ FOO + ].
>
> foo := bar &
> "a silly example" &
> [ FOO + ].
>
> foo := bar &
> [ FOO + ] &
> "a silly example".
>
> for full generality, one could then either allow a list of
> documentation strings associated with each type or instance definition
> (and have the addendum operator add to the tail of the list), or
> simply concatenate all such strings into one (presumably adding at
> least one newline between each pair of strings).
>
> —mike, many thanks for your work towards an up-to-date and
> consolidated definition of TDL syntax!
>
> oe
>
>
> On Thu, Jul 12, 2018 at 7:15 AM, Michael Wayne Goodman <goodmami at uw.edu> wrote:
>> On Wed, Jul 11, 2018 at 7:59 PM, Francis Bond <bond at ieee.org> wrote:
>>>
>>> On my phone so forgive the brevity, but in Paris we agreed that the
>>> docstring will start and end with three ".   I think this is what pet now
>>> supports, Dan has a patch for the lkb, and I suspect I was meant to ask
>>> Woodley to add it to ACE.
>>
>>
>> Sorry I missed out on all the fun this year :(
>>
>> I see what you describe mentioned in the LTDB presentation:
>> http://users.sussex.ac.uk/~johnca/summit-2018/ltdb-update.pdf
>>
>> It seems the changes to PET are not yet merged in. I think Glenn would
>> also like to know about the agreement.
>>
>>>
>>> Comments anywhere would be great.
>>>
>>> Thanks for pushing this forward Michael.
>>>
>>> On Thu, 12 Jul 2018, 12:34 Michael Wayne Goodman, <goodmami at uw.edu> wrote:
>>>>
>>>>
>>>> On Wed, Jul 11, 2018, 19:24 Woodley Packard <sweaglesw at sweaglesw.org>
>>>> wrote:
>>>>>
>>>>> 2 cents worth...
>>>>>
>>>>> 1. I think the logical behavior for an addendum with a doc string is
>>>>> concatenation, not replacement.  The doc string on the addendum should
>>>>> document what the addendum adds to the type, not the whole type.
>>>>
>>>>
>>>> Hmm, good point. I guess the addendum can only add constraints, so the
>>>> old docstring wouldn't necessarily become invalid. I was comparing to method
>>>> overrides in Python classes, but that's not a useful comparison.
>>>>
>>>>>
>>>>> 2. ACE (I believe) allows comments just about anywhere in TDL.  I find
>>>>> this very useful when editing TDL, e.g. annotating changes on a fine grained
>>>>> level or disabling certain constraints temporarily without deleting them.
>>>>
>>>>
>>>> Yes, it's definitely more useful to allow comments almost anywhere, and
>>>> not really hard to parse, either.
>>>>
>>>>> Regards,
>>>>> Woodley
>>>>>
>>>>> On Jul 11, 2018, at 5:00 PM, Michael Wayne Goodman <goodmami at uw.edu>
>>>>> wrote:
>>>>>
>>>>> Thank you, Bernd, for the feedback. But I'm not having success parsing
>>>>> types with docstrings using PET. E.g., I changed sign_min in the ERG like
>>>>> this:
>>>>>
>>>>>     sign_min := *avm* &
>>>>>       "doc"
>>>>>       [ SYNSEM synsem_min,
>>>>>         KEY-ARG bool ].
>>>>>
>>>>> But flop doesn't like it:
>>>>>
>>>>>     goodmami at tpy:~/grammars/erg$ flop english.tdl
>>>>>     reading `Version.lsp'...
>>>>>     converting `english.tdl' (ERG (1214)) into `english.grm' ...
>>>>>     loading `english.tdl'... including `fundamentals.tdl'...
>>>>>     fundamentals.tdl:21:3: error: (syntax) - got `                   [', expecting `.' at end of type definition
>>>>>     [...]
>>>>>
>>>>> I get similar errors no matter where I put it (before :=, directly after
>>>>> :=, after ]). It's syntactically valid if I have ... *avm* & "doc" & ...,
>>>>> but then it has trouble unifying (as expected).
>>>>>
>>>>> It does, however, seem to be happy having a comment there (both ; and #|
>>>>> styles) instead of a doc string.
>>>>>
>>>>> I'm using flop version 0.99.14svn_cm from the LOGON distribution.
>>>>>
>>>>> On Wed, Jul 11, 2018 at 7:00 AM, Bernd Kiefer <Bernd.Kiefer at dfki.de>
>>>>> wrote:
>>>>>>
>>>>>> Concerning question 3, at least in TDL and PET there was no such
>>>>>> restriction,
>>>>>> but that could make the definition of docstrings easier.
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Bernd
>>>>>>
>>>>>> On 11.07.2018 03:01, Michael Wayne Goodman wrote:
>>>>>>
>>>>>> I attempted to define a BNF-like description of TDL syntax on the wiki:
>>>>>> http://moin.delph-in.net/TdlRfc
>>>>>> I tried to follow the partial BNF in the LKB source and often referred
>>>>>> to the lisp code itself in order to fill out the rest of the description.
>>>>>>
>>>>>> My 3 questions above are concisely repeated at the bottom of the wiki
>>>>>> along with some others.
>>>>>>
>>>>>> I welcome corrections and discussion (here or on the wiki) from any TDL
>>>>>> nerds or authorities (especially if you've written a TDL parser).
>>>>>>
>>>>>> On Mon, Jul 9, 2018 at 12:49 PM, Michael Wayne Goodman
>>>>>> <goodmami at uw.edu> wrote:
>>>>>>>
>>>>>>> Hi developers,
>>>>>>>
>>>>>>> I'm taking a closer look at the syntax of TDL files and the situation
>>>>>>> is a bit of a mess. Can anyone help me clarify some things? (I'll restrict
>>>>>>> myself to 3 questions for now)
>>>>>>>
>>>>>>> The Copestake 2002 reference (Implementing TFS Grammars) has a BNF for
>>>>>>> TDL, but it's a bit out of date and, according to comments in the LKB source
>>>>>>> code, incorrect in parts. The LKB source comments are scattered, incomplete,
>>>>>>> inconsistent, and also a bit outdated. There is not much on the wiki. There
>>>>>>> is some discussion in the mailing list archives (much from before my time in
>>>>>>> DELPH-IN), but it's not clear how current those descriptions are.
>>>>>>>
>>>>>>> Q1: Are supertypes special in a definition?
>>>>>>>
>>>>>>> The BNF (in the LKB source) says this:
>>>>>>>
>>>>>>>     Type-def -> Type { Avm-def | Subtype-def} . |
>>>>>>>                          Type { Avm-def | Subtype-def}.
>>>>>>>     Avm-def -> := Conjunction | Comment Conjunction
>>>>>>>     Conjunction -> Term { & Term } *
>>>>>>>     Term -> Type | Feature-term | Diff-list | List | Coreference
>>>>>>>
>>>>>>> That makes it sound like I could do this:
>>>>>>>
>>>>>>>     mytype := [ FEAT val ] & supertype.
>>>>>>>
>>>>>>> or even:
>>>>>>>
>>>>>>>     mytype := <! diff list.. !> & #coref & supertype.
>>>>>>>
>>>>>>> But elsewhere it seems like a list of parents is special and appears
>>>>>>> before the rest of the conjunction. E.g., at read-tdl-avm-def of
>>>>>>> lingo/lkb/src/io-tdl/tdltypeinput.lsp I see this alternate definition of
>>>>>>> Avm-def:
>>>>>>>
>>>>>>>   ;;; Avm-def -> := Parents Conjunction | Parents Comment Conjunction |
>>>>>>>   ;;;               Parents | Parents Comment
>>>>>>>
>>>>>>> It seems that both ACE and PET are fine with putting supertypes after
>>>>>>> the feature list (and some other variations). I'm fine with this, but I
>>>>>>> wonder what it means for docstrings (see Q3 below), which (I think) are
>>>>>>> supposed to appear after the list of parents and before the feature list.
>>>>>>>
>>>>>>>
>>>>>>> Q2: Subtype-def is now just a variant of Avm-def, yes?
>>>>>>>
>>>>>>> The BNF still describes subtyping (with the :< operator) as only
>>>>>>> taking a single parent:
>>>>>>>
>>>>>>>     Subtype-def ->  :< type
>>>>>>>
>>>>>>> But I believe the consensus is that this is unnecessary (it's
>>>>>>> equivalent to using := with only a supertype), so :< is treated as
>>>>>>> equivalent to := (to avoid breaking backward compatibility). Is this
>>>>>>> interpretation used by all processors?
>>>>>>>
>>>>>>>
>>>>>>> Q3: What's the final word with type comments / docstrings?
>>>>>>>
>>>>>>> I find evidence of 3 proposed variants: (1) a block of ";" comments
>>>>>>> before a typename (LTDB-style); (2) a block of ";" comments within a type
>>>>>>> description; and (3) a "doc string" within a type description. Furthermore,
>>>>>>> there is a question as to whether comments or strings within a type go after
>>>>>>> the ":=" or after the list of supertypes. I think #| ... |# comments were
>>>>>>> not considered for this purpose.
>>>>>>>
>>>>>>> My guess is this:
>>>>>>>
>>>>>>> * LTDB-style comments (before the type identifier) are processed
>>>>>>> separately from TDL-parsing
>>>>>>> * type-internal comments can go anywhere but are discarded
>>>>>>> * type-internal doc strings must appear after the list of supertypes
>>>>>>> and are later available for inspection (they are included as a
>>>>>>> non-functional part of a type)
>>>>>>>
>>>>>>> ACE seems happy with my assumptions, although PET doesn't seem to like
>>>>>>> doc strings at all.
>>>>>>>
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> --
>>>>>>> Michael Wayne Goodman
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Michael Wayne Goodman
>>>>>>
>>>>>>
>>>>>> --
>>>>>> ----------------------------------------------------------------------
>>>>>> Bernd Kiefer     DFKI GmbH,  Stuhlsatzenhausweg,  D-66123 Saarbruecken
>>>>>> kiefer at dfki.de   +49-681/85775-5301 (phone)   +49-681/85775-5338 (fax)
>>>>>> ----------------------------------------------------------------------
>>>>>> Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
>>>>>> Trippstadter Strasse 122, D-67663 Kaiserslautern, Germany
>>>>>> Geschaeftsfuehrung: Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vor-
>>>>>>                     sitzender), Dr. Walter Olthoff
>>>>>> Vorsitzender des Aufsichtsrats: Prof. Dr. h.c. Hans A. Aukes
>>>>>> Amtsgericht Kaiserslautern, HRB 2313
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Michael Wayne Goodman
>>
>>
>>
>>
>> --
>> Michael Wayne Goodman



-- 
Francis Bond <http://www3.ntu.edu.sg/home/fcbond/>
Division of Linguistics and Multilingual Studies
Nanyang Technological University


