[developers] doc-strings in TDL for DELPH-IN
goodman.m.w at gmail.com
Tue Aug 28 02:53:39 CEST 2018
Hi Dan (and all),
I updated the TdlRfc wiki on August 7 to reflect the decisions we made, but
I hadn't yet worked on the implementation until today, coincidentally, as I
had been working on the v0.8.0 release with Angie. I should have a working
version in PyDelphin's 'develop' branch soon, so hold off on testing until you
see the relevant commit (or subscribe to this issue to be notified when
it's fixed: https://github.com/delph-in/pydelphin/issues/167).
Alternatively, if you commit the docstrings to ERG's repo, then I can test
it myself and report back.
On Mon, Aug 27, 2018 at 3:23 PM Dan Flickinger <danf at stanford.edu> wrote:
> Hi all,
> Thanks to the developers who have coordinated efforts so smoothly for
> these doc strings in TDL. I am happy to report that for each of the
> following platforms, I have been able to successfully (compile and) load
> and run the trunk version of the ERG suitably modified to use the new
> doc-string format for comments on all of the 1250 leaf lexical types in the
> LKB and LKB_FOS -- Using the latest versions available for download from
> the DELPH-IN LkbInstallation pages, both still require the ERG-supplied
> patch to the function read-tdl-type-comment() in erg/lkb/patches.lsp, but
> with this patch, the grammar with new doc-strings loads and runs fine. It
> would be nice to have developer-approved versions of this function instead
> of the patch, since any other grammar employing these new doc-strings will
> also currently need to include this patched version of the function.
> PET -- Using the latest `main' SVN branch, the updated grammar compiles,
> loads, and runs fine, with one surprising caveat: for some reason, two of
> the three files containing the new doc-strings (letypes.tdl and
> auxverbs.tdl) will not compile unless they include a final commented-out
> line with a particular number of characters. See the note at the end of
> each of these files; it would be nice to chase down and correct this
> hiccup, though it might not be urgent. Other grammar developers should
> monitor the behavior with their grammars in the meantime.
> ACE -- Using the `trunk' SVN version, the updated grammar compiles, loads,
> and runs fine. It would be good to now update the precompiled ACE binary
> on the ACE home page, so the ERG (and possibly other updated grammars) will
> I haven't yet checked to see if the latest PyDelphin is happy with this
> version, but will soon, unless Mike or Angie get there first (once I check
> in the ERG changes). I also don't know whether `agree' is ready to accept
> the new doc-strings.
> Next steps:
> 1. I will check in the updated `trunk' ERG, and hope that the ACE
> binary on that home page will be updated soon, for those who might be using
> the trunk ERG but not compiling their own ACE.
> 2. It would be good to have the $LOGONROOT/uio/bin/linux.x86.64
> binaries for `flop' and `cheap' updated to be consistent with the `main'
> branch of PET, since the existing ones cannot compile what I will check
> into the trunk ERG (soon to be stable version "2018").
> I'll hold off for a day in checking in the new ERG, in case anyone can
> foresee a reason for a different sequence of events to get us to a happier
> future consistent state of the world that embraces the new doc-strings.
> *From:* goodman.m.w at gmail.com <goodman.m.w at gmail.com>
> *Sent:* Tuesday, August 7, 2018 10:59 AM
> *To:* Woodley Packard
> *Cc:* Dan Flickinger; Francis Bond; Delph-in developers list
> *Subject:* Re: [developers] doc-strings in TDL for DELPH-IN
> Thanks, Dan,
> I'll do my part to update the wiki with the preferred syntax and add
> support into PyDelphin. Regarding the syntax description, it would be
> rather complicated to enforce one docstring per type in the production
> rules if it's not in a fixed position, so I'll let it accept multiple per
> type and just make a note for implementers that only the first one must be
> preserved (with the action for additional docstrings left undefined).
> And, Woodley, good catch on the regex bug. Those patterns should be
> negative lookahead assertions. I think the following works:
> DocString := /"""([^"\\]|\\.|"(?!")|""(?!"))*"""/
> Lookahead assertions can slow down regex searches, so this pattern is
> intended to be illustrative; a non-regex parser is fine as long as it also
> allows escaped characters (including quotes) and up to two unescaped quotes
> not followed by a third quote. Also, if it's not clear from the pattern,
> newlines are acceptable within the triple-quoted strings.
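As a rough check of the revised pattern, the sketch below runs both DocString patterns from this thread against Woodley's example using Python's `re` module. The names `OLD`, `NEW`, and `sample` are made up for illustration, and the trailing `Spacing` token of the original rule is left out since it is not part of the regex itself. This is only a sanity check, not the PyDelphin implementation.

```python
import re

# Earlier proposal (quoted at the end of this thread): a quote followed by
# any non-quote character is consumed as a pair, so a following backslash
# is swallowed before it can act as an escape.
OLD = re.compile(r'"""([^"\\]|\\.|"[^"]|""[^"])*"""')

# Revised proposal: negative lookaheads check that the next character is
# not a quote without consuming it.
NEW = re.compile(r'"""([^"\\]|\\.|"(?!")|""(?!"))*"""')

# Woodley's example; a raw string keeps the backslash as a literal character.
sample = r'"""hello"\"""not done yet"""'

old_match = OLD.match(sample)
new_match = NEW.match(sample)

# The old pattern stops at the triple quote that begins with the escaped quote.
print(old_match.group(0))
# The new pattern consumes the whole docstring.
print(new_match.group(0) == sample)

# Newlines inside the docstring are matched by the [^"\\] alternative.
multiline = '"""line one\nline two"""'
print(NEW.fullmatch(multiline) is not None)
```

Note that `.` in the `\\.` alternative does not match a newline unless `re.DOTALL` is set, so a backslash immediately followed by a line break is not covered by this sketch.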
> On Tue, Aug 7, 2018 at 12:28 AM Woodley Packard <sweaglesw at sweaglesw.org> wrote:
> Hello docstringers,
> I have added to ACE the ability to detect and ignore triple-quoted strings
> anywhere within a TDL statement. I will leave it to others to determine
> and police legal placement. The (very lightly tested) update is available
> in the ACE SVN trunk for those who wish to test it. I will be happy to
> make a binary release soon if bugs are not uncovered.
> I have one nit to pick with the proposed regular expression for doc
> strings. The following docstring would be treated as terminating early,
> since the backslash is gobbled up without being interpreted:
> """hello"\"""not done yet"""
> This one is legal in python (and handled properly by ACE :-)).
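For implementers who prefer not to use a regex, a hand-written scanner with the same behavior might look like the sketch below. It is a minimal illustration in Python, not ACE's actual code; the function name `scan_docstring` and its return convention are assumptions made up for this example.

```python
def scan_docstring(text, start=0):
    """Return the index just past the closing triple quote of a docstring
    opening at text[start], or -1 if none opens there or it never closes.

    Hypothetical helper for illustration only.
    """
    if not text.startswith('"""', start):
        return -1
    i = start + 3
    n = len(text)
    while i < n:
        if text[i] == '\\':
            # Skip the backslash and whatever character follows it,
            # so an escaped quote can never start a closing delimiter.
            i += 2
        elif text.startswith('"""', i):
            # First unescaped run of three quotes closes the docstring.
            return i + 3
        else:
            # Ordinary character, or one or two unescaped quotes
            # not followed by a third.
            i += 1
    return -1


# The example above; a raw string keeps the backslash literal.
sample = r'"""hello"\"""not done yet"""'
print(scan_docstring(sample) == len(sample))
```

Unlike the regex in this thread, this scanner also skips a backslash followed by a newline; whether that should be allowed is left open here.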
> On Aug 3, 2018, at 8:57 PM, Francis Bond <bond at ieee.org> wrote:
> DocString := /"""([^"\\]|\\.|"[^"]|""[^"])*"""/ Spacing
> -Michael Wayne Goodman
-Michael Wayne Goodman