[developers] post-reduction lexical gap

Olga Zamaraeva olzama at uw.edu
Wed Dec 20 20:27:26 CET 2017


Kristen, where were you seeing the "No analysis" message? Simply in your
LKB window, right?

I just did the following:

1) I looked at the most recent regression test that I added, specifically
the "parse" file under home/gold/languagename. Here's what I see there:

1@1@1@-1@0@-1@0@2@-1@1@1@-1@1@-1@-1@-1@-1@-1@-1@-1@-1@-1@-1@0@0@0@0@0@-1@-1@
-1@-1@220636@-1@-1@-1@23-6-2013 14:28:24@0@
2@1@2@-1@0@-1@0@0@-1@0@0@-1@0@-1@-1@-1@-1@-1@-1@-1@-1@-1@-1@0@0@0@0@0@-1@-1@
-1@-1@44636@-1@-1@-1@23-6-2013 14:28:24@post-reduction lexical gap@

So the first item is fine and the second one has the error message.
Incidentally, the second one is ungrammatical, so in the testsuite it
starts with a star, though that should be completely fine, shouldn't it?
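
For anyone who wants to scan a "parse" file for messages like this, here is
a minimal sketch in Python. It rests on assumptions that are mine and are
not confirmed anywhere in this thread: that each record is a single line
with "@"-separated fields, that the third field is the item id, and that
the message sits in the second-to-last field (the one right after the
timestamp), with "0" or an empty string meaning no message. The function
name find_parse_errors is made up for illustration.

```python
def find_parse_errors(lines):
    """Return (item_id, message) pairs for records that carry a message.

    Assumptions (not confirmed by the thread): fields are separated by "@",
    the third field is the item id, and the second-to-last field holds the
    message, where "0" or "" means there is nothing to report.
    """
    errors = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue
        fields = line.split("@")
        if len(fields) < 3:
            continue  # not a record in the assumed layout
        item_id = fields[2]
        message = fields[-2]
        if message not in ("", "0"):
            errors.append((item_id, message))
    return errors


# The two records quoted above, each joined back into one line.
records = [
    "1@1@1@-1@0@-1@0@2@-1@1@1@-1@1@-1@-1@-1@-1@-1@-1@-1@-1@-1@-1@0@0@0@0@0"
    "@-1@-1@-1@-1@220636@-1@-1@-1@23-6-2013 14:28:24@0@",
    "2@1@2@-1@0@-1@0@0@-1@0@0@-1@0@-1@-1@-1@-1@-1@-1@-1@-1@-1@-1@0@0@0@0@0"
    "@-1@-1@-1@-1@44636@-1@-1@-1@23-6-2013 14:28:24@post-reduction lexical gap@",
]
print(find_parse_errors(records))
```

To run it on a real profile, pass it the open file object for the "parse"
file instead of the hard-coded list.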

The item in the corresponding testsuite looks like this:

* noun1 noun2 tverb

-- there is a space between the star and "noun1"; but I think that's how my
testsuites have always looked. Is that correct or should there be no
space, or are there any other caveats with the asterisk?
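
One way to check a testsuite line for things that could end up as an
unexpected token is sketched below in Python. It only inspects the raw
line: whether it starts with a star, and whether it ends in a carriage
return (the ^M characters Kristen mentions in the message quoted below).
It makes no claim about how [incr tsdb()], the LKB or ACE actually treat
the star; inspect_item and its return keys are names I made up.

```python
def inspect_item(line):
    """Describe one raw testsuite line (hypothetical helper).

    Reports whether the line starts with "*", whether it ends in a
    carriage return, and what the whitespace-separated tokens are once
    the star and the line ending are set aside.
    """
    has_cr = line.endswith("\r\n") or line.endswith("\r")
    text = line.rstrip("\r\n").strip()
    starred = text.startswith("*")
    sentence = text.lstrip("*").strip() if starred else text
    return {
        "starred": starred,
        "carriage_return": has_cr,
        "sentence": sentence,
        "tokens": sentence.split(),
    }


print(inspect_item("* noun1 noun2 tverb\r\n"))
```

When reading from a file, open it with newline="" (or in binary mode);
Python's default text mode translates "\r\n" to "\n" and would hide the
carriage returns.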


2) Anyway, then I loaded this grammar (which is stored under
gmcs/regression_tests/grammars, until deleted) into the LKB and ran the
testsuite with [incr tsdb()]. I don't think I am seeing anything
suspicious, in particular no "No analysis" messages:

[image: image.png]

Interestingly, the star does not appear to be separated from the sentence
by a space; I am assuming the system gets rid of it for this view? Not sure
if this is relevant.

But there must be something going on with the asterisk in my case...

Olga
On Wed, Dec 20, 2017 at 10:21 AM Kristen Howell <kphowell at uw.edu> wrote:

> Thanks! It looks like my text editor had added ^M characters at the end of
> each line, so that's the unknown word/lexical gap. I'll update the
> discourse page with this as well.
>
> On Wed, Dec 20, 2017 at 10:00 AM, Emily M. Bender <ebender at uw.edu> wrote:
>
>> Hi Kristen,
>>
>> Two things to try:
>>
>> (1) Parse the sentence interactively with ACE (rather than the LKB)
>> (2) Open the [incr tsdb()] profile and parse the sentence interactively
>> (with the LKB) by double-clicking it
>>
>> Emily
>>
>> On Wed, Dec 20, 2017 at 9:57 AM, Woodley Packard <sweaglesw at sweaglesw.org
>> > wrote:
>>
>>> Hi again,
>>>
>>> Dan's assessment is accurate for the case of the ERG and other grammars
>>> that implement unknown word handling.  For smaller grammars, however, the
>>> error will come from cases where the input sentence contains a word that is
>>> not in the lexicon (which is what I meant by a lexical gap). More
>>> specifically, there is a word for which a stem cannot be found in the
>>> lexicon using the existing orthographemic rules.  Since Olga was talking
>>> about matrix regression testing I suspect this is her context.
>>>
>>> Woodley
>>>
>>>
>>>
>>> On Dec 20, 2017, at 9:38 AM, Kristen Howell <kphowell at uw.edu> wrote:
>>>
>>> Thanks everyone. I got this parse error a lot with toy grammars that I
>>> produced with the grammar matrix. I don't see lfr.tdl anywhere in those
>>> output grammars. Does anyone know what might be the grammar matrix
>>> equivalent?
>>>
>>> On Wed, Dec 20, 2017 at 12:57 AM, Dan Flickinger <danf at stanford.edu>
>>> wrote:
>>>
>>>> Hi Olga,
>>>>
>>>>
>>>> This error message from ACE occurs when your grammar includes one or
>>>> more token-mapping (preprocessing) rules which conspire to delete all of
>>>> the edges from one of the cells in the initial parse chart, so that the
>>>> parser cannot hope to find a covering analysis.  For the ERG, this
>>>> occasionally happens when I adjust the rules for inserting lexical entries
>>>> for unknown words, and then try to filter some of those added entries in
>>>> case I already have a "native" entry for that word.  I imagine you must
>>>> have one or more rules in a file like the ERG's "lfr.tdl" which discard
>>>> some unwanted entries before parsing.  Try commenting out these rules and
>>>> see if the error message for that sentence disappears. You can see the
>>>> error just by trying to parse that one sentence directly with ACE.
>>>>
>>>>
>>>>  Dan
>>>>
>>>>
>>>> ------------------------------
>>>> *From:* developers-bounces at emmtee.net <developers-bounces at emmtee.net>
>>>> on behalf of Olga Zamaraeva <olzama at uw.edu>
>>>> *Sent:* Tuesday, December 19, 2017 11:07 PM
>>>> *To:* Woodley Packard
>>>> *Cc:* developers at delph-in.net
>>>> *Subject:* Re: [developers] post-reduction lexical gap
>>>>
>>>> Hi Woodley,
>>>>
>>>> At the risk of asking a basic question: What does a lexical gap mean?
>>>>
>>>> Thank you,
>>>> Olga
>>>> On Tue, Dec 19, 2017 at 9:35 PM Woodley Packard <
>>>> sweaglesw at sweaglesw.org> wrote:
>>>>
>>>> Hi all,
>>>>
>>>> This means ACE encountered a lexical gap.  It should not be a rare
>>>> occurrence, at least on non-toy data.
>>>>
>>>> Best, Woodley
>>>>
>>>>
>>>>
>>>> On Dec 19, 2017, at 1:21 PM, Stephan Oepen <oe at ifi.uio.no> wrote:
>>>>
>>>> hi olga,
>>>>
>>>> this looks like an error message from the parsing client, not [incr
>>>> tsdb()] proper.  i would guess maybe ACE?  if so, woodley will likely be
>>>> able to shed more light on this question :-).
>>>>
>>>> best, oe
>>>>
>>>>
>>>> On Tue, 19 Dec 2017 at 19:19 Olga Zamaraeva <olzama at uw.edu> wrote:
>>>>
>>>> Dear developers,
>>>>
>>>> What does the message "post-reduction lexical gap" mean in the context
>>>> of adding an [ incr tsdb() ] profile for regression testing to the Grammar
>>>> Matrix?
>>>>
>>>> Here's the type of line that we sometimes see in home/gold/mytest/parse
>>>>
>>>> 10@1@10@-1@0@-1@0@0@-1@0@0@-1@0@-1@-1@-1@-1@-1@-1@-1@-1@-1@-1@0@0@0@0@0
>>>> @-1@-1@-1@-1@34728@-1@-1@-1@23-6-2013 14:28:24@post-reduction lexical
>>>> gap@
>>>>
>>>> This seems problematic but we don't really know what it is at this
>>>> point.
>>>>
>>>> (Here's a related topic on Discourse:
>>>> https://delphinqa.ling.washington.edu/t/no-parses-when-adding-regression-tests/81/7
>>>> )
>>>>
>>>> Thank you,
>>>> Olga
>>>>
>>>>
>>>
>>
>>
>> --
>> Emily M. Bender
>> Professor, Department of Linguistics
>> Check out CLMS on facebook! http://www.facebook.com/uwclma
>>
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 100783 bytes
Desc: not available
URL: <http://lists.delph-in.net/archives/developers/attachments/20171220/854ea83e/attachment-0001.png>

