[developers] More TDL cobwebs

goodman.m.w at gmail.com goodman.m.w at gmail.com
Mon Oct 8 00:32:36 CEST 2018


Hi all,

Francis suggested I include :begin ... :end blocks in PyDelphin's TDL
parsing but first I need them defined. They are similar to Krieger &
Schäfer 1994's description of "environments", but appear to be a very
restricted subset (only :type and :instance) with a few syntactic
differences (e.g., we use more colons, such as ":begin", ":include", and
":end" instead of "begin", "include", and "end", and we allow ":status"
specifications after :instance). I've attempted to describe the format on
the wiki. Here's a link to the diff (and note that I already fixed the
"DOT" bug): http://moin.delph-in.net/TdlRfc?action=diff&rev1=15&rev2=16.
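
To make the question concrete, here is a minimal Python sketch of how such
blocks might be recognized. It is not PyDelphin's actual code. The names
ENV_RE and scan_environments are hypothetical, and the sketch assumes that
each :begin/:end statement sits on its own line, ends with ".", and that
environments may nest; none of these assumptions has been confirmed.

```python
import re

# Hypothetical pattern for lines such as
#   :begin :type.
#   :begin :instance :status lex-rule.
#   :end :instance.
ENV_RE = re.compile(
    r'^\s*:(?P<cmd>begin|end)\s+:(?P<env>type|instance)'
    r'(?:\s+:status\s+(?P<status>[^\s.]+))?\s*\.\s*$'
)


def scan_environments(text):
    """Return (line_number, command, environment, status) for each
    :begin/:end line found in `text`.

    Raises ValueError if an :end does not match the innermost open
    :begin, or if an environment is left open at the end of the text.
    """
    stack = []   # currently open environments (nesting is an assumption)
    events = []
    for lineno, line in enumerate(text.splitlines(), 1):
        m = ENV_RE.match(line)
        if m is None:
            continue  # ordinary TDL content; not inspected here
        cmd, env, status = m.group('cmd', 'env', 'status')
        if cmd == 'begin':
            stack.append(env)
        else:
            if not stack or stack[-1] != env:
                raise ValueError(
                    'line %d: unexpected :end :%s' % (lineno, env))
            stack.pop()
        events.append((lineno, cmd, env, status))
    if stack:
        raise ValueError('unclosed environment: :%s' % stack[-1])
    return events
```

The sketch only tracks where environments open and close; it does not parse
the definitions inside them or handle :include.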

If anyone has any knowledge or opinions about these environments, could
they please check the description on the wiki and see if it's accurate?

Also, I added a clarification about #| block-comments |#. In Lisp they can
nest, and ACE seems to allow nesting, but the LKB does not, so I added a
comment saying that nesting is not fully supported.
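
The difference between the two behaviours can be illustrated with a small
Python sketch. The function skip_block_comment and its "nested" flag are
hypothetical and are not taken from ACE, the LKB, or PyDelphin; the sketch
only shows what "nesting" versus "ending at the first |#" means for a
scanner.

```python
def skip_block_comment(text, start, nested=True):
    """Return the index just past the block comment opening at `start`.

    With nested=True, inner #| ... |# pairs are balanced (Lisp-style).
    With nested=False, the comment ends at the first |# encountered.
    """
    if not text.startswith('#|', start):
        raise ValueError('no block comment at index %d' % start)
    depth = 1
    i = start + 2
    while i < len(text):
        if text.startswith('|#', i):
            depth -= 1
            i += 2
            if depth == 0:
                return i
        elif nested and text.startswith('#|', i):
            depth += 1
            i += 2
        else:
            i += 1
    raise ValueError('unterminated block comment')
```

Given the input "#| a #| b |# c |# x := y.", the nested reading skips
everything up to the second |#, while the non-nested reading stops after the
first |# and leaves " c |# x := y." to be parsed as TDL, which is why a
grammar that relies on nesting is not portable across processors.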


On Mon, Sep 10, 2018 at 6:44 AM Emily M. Bender <ebender at uw.edu> wrote:

> Just for the record: Knoppix+LKB has been superseded by Ubuntu+LKB and we
> do keep it updated:
>
> https://wiki.ling.washington.edu/bin/view.cgi/Main/KnoppixLKB
>
> Emily
>
> On Sat, Sep 8, 2018 at 4:57 AM, John Carroll <J.A.Carroll at sussex.ac.uk>
> wrote:
>
>> Hi,
>>
>> Thanks for trying to fix the LKB.
>>
>> I think your TDL clean-ups are a very good idea. The new version of
>> read-tdl-type-comment in patches.lsp will indeed eventually make it into
>> the LKB proper. But I was concerned about not being able to patch existing
>> LKB binaries effectively. When I referred to backward compatibility, I was
>> thinking about LKB binaries in distributions that may never get updated,
>> e.g. http://www.cs.upc.edu/~padro/docker-logon.tgz and Knoppix+LKB.
>> This might not be too much of a problem in practice except that some LKB
>> error messages are poor or misleading.
>>
>> I'll have a go at making a minimal set of changes that could be put in a
>> patch file, and add a more considered reimplementation of TDL reading to my
>> todo list.
>>
>> John
>>
>> On 8 Sep 2018, at 00:09, goodman.m.w at gmail.com wrote:
>>
>> Hi again,
>>
>> I spent an hour or two editing patches.lsp to try and make it work, but
>> my lisp writing and debugging knowledge is too limited to figure it out
>> right now. Here's what I tried to do:
>>
>> * read-tdl-top-conjunction:
>>   - a copy of read-tdl-conjunction, except for the following...
>>   - call read-tdl-type-comment if peek-with-comments returns " before
>> calling read-tdl-defterm
>>   - append the pair (docstring . term) to the "constraint" variable
>> instead of just term
>> * read-tdl-avm-def:
>>   - remove the part about reading parents
>>   - expect a pair (docstring . term) from read-tdl-top-conjunction
>>   - append the docstring to the "comment" variable
>>   - extract the term as "unif" and proceed as before
>> * read-tdl-type-comment:
>>   - if it doesn't encounter """, it calls unread-char to put those quotes
>> back on the stream, because it may be a regular "string" or empty "" string
>>   - don't print an error if the string doesn't start with """
>>
>> I only created read-tdl-top-conjunction so that I didn't have to redefine
>> all the other places where read-tdl-conjunction was used. Trying to load
>> the ERG with these changes gives me an "Unexpected unif" error when it
>> tries to load fundamentals.tdl.
>>
>> On Fri, Sep 7, 2018 at 11:59 AM goodman.m.w at gmail.com <
>> goodman.m.w at gmail.com> wrote:
>>
>>> Thanks for the feedback, John,
>>>
>>> While I appreciate your arguments and code, I am reluctant to agree with
>>> any changes now. The LKB has been a pioneer in allowing docstrings, but I
>>> don't think we should revert the work other developers have put into their
>>> processors in the last month, not to mention the hard-earned consensus over
>>> the color of this bike shed. Here are my reasons:
>>>
>>> 1. The agreed-upon syntax does not break backward compatibility (except
>>> regarding the number of quote characters), it only opens up new places
>>> where docstrings may occur (see (3))
>>>
>>> 2. The lack of support for docstrings outside of the LKB hindered their
>>> adoption, so backward compatibility isn't much of an issue given that
>>> grammar developers avoided using them (given this, maybe I should have
>>> pushed harder for docstrings immediately after := or :+... oh well).
>>>
>>> 3. The LKB's implementation that parses supertypes (or "parents" as used
>>> in the lisp code) before other terms is only half-baked. It first reads
>>> some type names, then looks for a docstring, then reads other terms, which
>>> may include more type names. I proposed making a change to the syntax so
>>> that type names must appear before other terms in a top-level conjunction,
>>> but the only replies I got addressing this point (from Stephan and Dan)
>>> opposed such a change. Thus, we agreed that type names have no special
>>> position in conjunctions. Because of this, saying that the docstring must
>>> occur before the AVM means little, because (a) the AVM may appear before a
>>> type name, and (b) there may be more than one AVM. For instance, the LKB
>>> (with the ERG's triple-quoted patch) currently accepts these:
>>>
>>>     a := b & c """doc""".
>>>     a := b & """doc""" c.
>>>     a := b & c & """doc""" [ Q r ].
>>>     a := b & """doc""" c & [ Q r ].
>>>     a := b & """doc""" [ Q r ] & c.
>>>
>>> but not these:
>>>
>>>     a := """doc""" b & c.
>>>     a := """doc""" b & c & [ Q r ].
>>>     a := b & c & [ Q r ] """doc""".
>>>
>>> Furthermore, it accepts:
>>>
>>>     a := b & c & [ Q r ].
>>>     a := b & [ Q r ] & c.
>>>
>>> but not:
>>>
>>>     a := [ Q r ] & b & c.
>>>
>>> I imagine a grammar developer (who doesn't browse the lisp code) would
>>> not find these facts consistent. It should either enforce that all
>>> supertypes appear before other terms, or allow them to mix freely.
>>>
>>> So, on the one hand, I think that the LKB is currently deficient WRT the
>>> above patterns (which are all allowed, according to current consensus). I
>>> may take a look at fixing the Lisp code, but it would take me a while. On
>>> the other hand, the LKB merely enforces the conventional layout of TDL
>>> definitions, so it is unlikely to cause problems for now.
>>>
>>> Finally, docstrings are desired for more than just the ERG, so the
>>> temporary solution in patches.lsp should eventually make it into the LKB
>>> proper. For instance, the read-tdl-avm-def and read-tdl-conjunction
>>> functions would need some changes and the read-tdl-type-parents function
>>> should probably just be removed.
>>>
>>> On Fri, Sep 7, 2018 at 4:58 AM John Carroll <J.A.Carroll at sussex.ac.uk>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I've been looking at TDL reading in the LKB, and (partly for pragmatic
>>>> reasons) I suggest restricting docstrings to occur only in the position
>>>> immediately preceding the AVM - or just before the final . terminator if
>>>> there is no AVM. Here are my reasons:
>>>>
>>>> 1. The LKB currently only allows docstrings in that position, and
>>>> changing this while retaining backward compatibility would require an
>>>> unreasonable amount of patching in a grammar lkb/patches.lsp file
>>>> 2. This position is analogous to where docstrings are allowed in
>>>> programming languages / docstring packages
>>>>
>>>> In the hope that this is acceptable, at least for the time being, I've
>>>> sent Dan a new version of his patch to change docstrings from double-quoted
>>>> to triple double-quoted in the LKB. The patch is attached in case other
>>>> grammar developers want to pick it up.
>>>>
>>>> John
>>>>
>>>> On 7 Sep 2018, at 00:29, goodman.m.w at gmail.com wrote:
>>>>
>>>> Hi all,
>>>>
>>>> There are some remaining issues with TDL that I'd like to clean up.
>>>> First I will summarize some decisions made (or at least not rejected) in
>>>> previous email threads:
>>>>
>>>> 1. Supertypes appear before other terms in a conjunction only by
>>>> convention (not enforced in the syntax)
>>>> 2. Docstrings are triple-quoted and may appear before any top-level
>>>> term or before the final . terminator
>>>> 3. Comments may appear in definitions anywhere that spaces can, except
>>>> within strings/regexes/affixing-patterns
>>>>
>>>> The following changes are things I think people agree with, so I'd like
>>>> to consider them as decided:
>>>>
>>>> 4. Removal of the :< operator (if accepted as a variant of :=, throw a
>>>> warning)
>>>> 5. Removal of 'single-quoted-symbols
>>>> 6. Removal of double-quoted "docstrings"
>>>> 7. Removal of non-regex uses of ^ (otherwise any BNF of TDL is
>>>> necessarily incomplete because the "extended-syntax" use of ^ is open-ended)
>>>>
>>>> And there's at least one point I don't think we reached a decision on:
>>>>
>>>> 8. Instances must have exactly 1 "supertype" (which is really just a
>>>> type and not a supertype, i.e., it doesn't change the type hierarchy)
>>>>
>>>> Also:
>>>>
>>>> 9. Does anyone know how wild-cards differ from letter-sets? I see HaG
>>>> has a wild-card and suffix pattern like these:
>>>>
>>>>     %(wild-card (?g ui))
>>>>     ...
>>>>     %suffix (!c!v !c!vn) (!v?g !vn)
>>>>
>>>> My guess is that wild-cards match but are not used in the replacement,
>>>> which I can imagine is useful if you want the replacement to use the second
>>>> of two matches but not the first. It makes me wonder why we don't just use
>>>> regex substitutions for these things.
>>>>
>>>> If nobody responds about (1)--(7), I'll make sure the syntax
>>>> description on the TdlRfc wiki reflects those decisions.
>>>>
>>>> --
>>>> -Michael Wayne Goodman
>>>>
>>>>
>>>>
>>>>
>>>
>>> --
>>> -Michael Wayne Goodman
>>>
>>
>>
>> --
>> -Michael Wayne Goodman
>>
>>
>>
>
>
> --
> Emily M. Bender
> Professor, Department of Linguistics
> University of Washington
> Twitter: @emilymbender
>


-- 
-Michael Wayne Goodman