[developers] Command line usage of the LKB's REPP

Michael Wayne Goodman goodmami at uw.edu
Thu Apr 26 23:29:12 CEST 2018


Thanks for confirming. So far I am finding that the C++ version does indeed
return better characterization than the other implementations I've tried,
although some corner cases with inserted characters are surprising.

Also, the C++ version uses PCRE regexes for matching but apparently not for
substitution. It only substitutes numbered groups (e.g., \1, \2, etc.), and
not named capture groups (e.g., \g{name}, \g<name>, \k<name>, etc.), or
numbered groups over 9 (e.g., \g{10}). I'm not suggesting that the code
needs to be fixed, but the limitation may need to be documented, e.g., on
the wiki (which I can do once the limitation is confirmed).
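
To make the difference concrete, here is a minimal Python sketch. It is not
the PET code, and the helper names (expand_numbered_only, sub_numbered_only)
are hypothetical; it only assumes that the substitution step expands \1
through \9 and treats everything else in the replacement as literal text,
which is the behavior described above and still to be confirmed.

```python
import re

# Assumption: the replacement template understands only \1 .. \9.
# This mimics the limitation described above; it is not PET's actual code.
_DIGIT_REF = re.compile(r'\\([1-9])')


def expand_numbered_only(match, template):
    """Expand \\1..\\9 in template from match; leave all other text literal."""
    return _DIGIT_REF.sub(
        # An unmatched optional group yields None, so fall back to ''.
        lambda ref: match.group(int(ref.group(1))) or '',
        template,
    )


def sub_numbered_only(pattern, template, text):
    """Like re.sub, but with the restricted template expansion above."""
    return re.sub(pattern, lambda m: expand_numbered_only(m, template), text)


pattern = r'(?P<first>a)(b)'

# Numbered references are expanded by the restricted expander.
swapped = sub_numbered_only(pattern, r'\2\1', 'abab')

# A named reference is passed through as literal text...
literal = sub_numbered_only(pattern, r'\g<first>', 'abab')

# ...whereas Python's own re.sub expands it.
native = re.sub(pattern, r'\g<first>', 'abab')

print(swapped, literal, native)
```

A rule written with a named reference would therefore silently produce
different output depending on the implementation, which is why I think the
limitation is worth documenting.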

On Thu, Apr 26, 2018 at 2:09 PM, Stephan Oepen <oe at ifi.uio.no> wrote:

> the lisp implementation is very much used, but i would put my money on the
> C++ version regarding correct characterization, if there were disagreement
> in corner cases.
>
> oe
>
>
> On Thu, 26 Apr 2018 at 22:46 Michael Wayne Goodman <goodmami at uw.edu>
> wrote:
>
>> Thank you for the advice, Stephan. For context, I'm comparing the REPP
>> implementations I know about (Lisp (LKB), C++ (PET), C (ACE), C# (agree))
>> in order to inform the design of my own Python implementation. Would you
>> consider the Lisp implementation to be abandoned or deprecated, or is it
>> perhaps still used by the LKB?
>>
>> On Thu, Apr 26, 2018 at 1:30 PM, Stephan Oepen <oe at ifi.uio.no> wrote:
>>
>>> hi mike,
>>>
>>> i would strongly advise you use the C++ implementation of REPP as your
>>> reference.  it implements the right way of determining character ranges
>>> across deletion and substitution rules, as introduced in dridan & oepen
>>> (2012):
>>>
>>> https://aclanthology.info/papers/P12-2074/p12-2074
>>>
>>> the LKB implementation predates that work and is known to be deficient
>>> about its characterization in corner cases.
>>>
>>> best wishes, oe
>>>
>>>
>>> On Thu, 26 Apr 2018 at 21:45 Michael Wayne Goodman <goodmami at uw.edu>
>>> wrote:
>>>
>>>> Hello,
>>>>
>>>> Does anyone know a good way to invoke the LKB's REPP implementation
>>>> from the command line (i.e., just tokenization, no parsing)? I'm currently
>>>> doing this:
>>>>
>>>>     $ "${LOGONROOT}/"bin/logon --tty <<< "(lkb::read-repp \"testrpp/test.rpp\")(lkb::repp \"abab\")"
>>>>
>>>> It works, but I get a bunch of Lisp messages that I'm having trouble
>>>> filtering.
>>>>
>>>>     International Allegro CL Enterprise Edition
>>>>     10.0 [64-bit Linux (x86-64)] (Jun 10, 2017 21:22)
>>>>     ...
>>>>     Really exit lisp [n]?
>>>>
>>>> The output I want is within the "..." above. The messages are not on
>>>> stderr, so I can't just redirect 2>/dev/null.
>>>>
>>>> Thanks for any help
>>>>
>>>> --
>>>> Michael Wayne Goodman
>>>>
>>>
>>
>>
>> --
>> Michael Wayne Goodman
>>
>


-- 
Michael Wayne Goodman