[developers] post-reduction lexical gap

Wed Dec 20 20:51:57 CET 2017

Hi Olga, I do not use a space between the * and first word. I'm not sure
that's the problem but maybe try removing it.
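
Both suspects in this thread (a space after the star, and the ^M characters mentioned further down) can be checked before importing the items. A minimal sketch, assuming a plain-text item file with one item per line; `check_item` and `clean_item` are hypothetical helper names, not part of any DELPH-IN tool, and whether the space after the star actually matters is still an open question here:

```python
def check_item(line):
    """Return a list of possible problems found in one test-item line."""
    problems = []
    # A carriage return is what an editor shows as ^M at the end of a line.
    if "\r" in line:
        problems.append("carriage return (^M)")
    body = line.rstrip("\r\n")
    # Flag "* word ..." as opposed to "*word ...".
    if body.startswith("*") and body[1:2].isspace():
        problems.append("space after leading *")
    return problems


def clean_item(line):
    """Strip line endings and join a leading star to the first word."""
    body = line.rstrip("\r\n")
    if body.startswith("*"):
        body = "*" + body[1:].lstrip()
    return body
```

Running `check_item` over every line of the item file before the import step would show which items, if any, carry either problem.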

I saw the error message by
1) loading the grammar in regression_test/grammars into the lkb
2) importing the item file into [incr tsdb()] (file-> import-> test items)
3) processing the profile (Process-> all items)

The errors were printed to the emacs window.
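
The same message also ends up in the profile's "parse" file, so it can be looked for there as well. A rough sketch, assuming one "@"-separated record per line with the message in the last position before the final "@" (as in the lines quoted in this thread) and the first field serving as an identifier; the function names and the shortened sample records are made up for illustration:

```python
def last_field(record):
    """Return (first field, text before the final '@') for one record."""
    fields = record.rstrip("\r\n").split("@")
    # A trailing "@" leaves an empty string at the end of the split.
    if fields[-1] == "":
        fields = fields[:-1]
    return fields[0], fields[-1]


def flagged(records, needle="lexical gap"):
    """List the identifiers of records whose last field mentions needle."""
    return [ident for ident, msg in map(last_field, records) if needle in msg]


# Shortened, made-up records in the same shape as the quoted ones.
sample = [
    "1@1@1@-1@23-6-2013 14:28:24@0@",
    "2@1@2@-1@23-6-2013 14:28:24@post-reduction lexical gap@",
]
print(flagged(sample))
```

This only reads the file as text; it does not depend on the LKB, ACE, or [incr tsdb()] being loaded.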

On Wed, Dec 20, 2017 at 11:27 AM, Olga Zamaraeva <olzama at uw.edu> wrote:

> Kristen, where were you seeing the "No analysis" message? Simply in your
> LKB window, right?
>
> I just did the following:
>
> 1) looked into the most recent regression test that I added, looked into
> the "parse" file under home/gold/languagename. Here's what I see there:
>
> 1@1@1@-1@0@-1@0@2@-1@1@1@-1@1@-1@-1@-1@-1@-1@-1@-1@-1@-1@-1@0@0@0@0@0@-1@
> -1@-1@-1@220636@-1@-1@-1@23-6-2013 14:28:24@0@
> 2@1@2@-1@0@-1@0@0@-1@0@0@-1@0@-1@-1@-1@-1@-1@-1@-1@-1@-1@-1@0@0@0@0@0@-1@
> -1@-1@-1@44636@-1@-1@-1@23-6-2013 14:28:24@post-reduction lexical gap@
>
> So, first item is fine, second one has the error message; incidentally,
> the second one is ungrammatical, so, in the testsuite, it would start from
> the star, though that should be completely fine, shouldn't it?
>
> The item in the corresponding testsuite looks like this:
>
> * noun1 noun2 tverb
>
> -- there is a space between the star and "noun1"; but I think that's how
> my testsuites have always looked. Is that correct or should there be
> no space, or are there any other caveats with the asterisk?
>
>
> 2) Anyway, then I loaded this grammar (which is stored under
> gmcs/regression_tests/grammars, until deleted) into the LKB and ran the
> testsuite with [incr tsdb()]. I don't think I am seeing anything
> suspicious, in particular no "No analysis" messages:
>
> [image: image.png]
>
> Interestingly, the star does not appear to be separated from the sentence
> by a space, I am assuming the system gets rid of it for this view? Not sure
> if this is relevant.
>
> But it must be something going on with the asterisk in my case...
>
> Olga
> On Wed, Dec 20, 2017 at 10:21 AM Kristen Howell <kphowell at uw.edu> wrote:
>
>> Thanks! It looks like my text editor had added ^M characters at the end
>> of each line, so that's the unknown word/lexical gap. I'll update the
>> discourse page with this as well.
>>
>> On Wed, Dec 20, 2017 at 10:00 AM, Emily M. Bender <ebender at uw.edu> wrote:
>>
>>> Hi Kristen,
>>>
>>> Two things to try:
>>>
>>> (1) Parse the sentence interactively with ACE (rather than the LKB)
>>> (2) Open the [incr tsdb()] profile and parse the sentence interactively
>>> (with the LKB) by double-clicking it
>>>
>>> Emily
>>>
>>> On Wed, Dec 20, 2017 at 9:57 AM, Woodley Packard <
>>> sweaglesw at sweaglesw.org> wrote:
>>>
>>>> Hi again,
>>>>
>>>> Dan's assessment is accurate for the case of the ERG and other grammars
>>>> that implement unknown word handling.  For smaller grammars however the
>>>> error will be from cases where the input sentence contains a word that is
>>>> not in the lexicon (which is what I meant by a lexical gap). More
>>>> specifically, there is a word for which a stem cannot be found in the
>>>> lexicon using the existing orthographemic rules.  Since Olga was talking
>>>> about matrix regression testing I suspect this is her context.
>>>>
>>>> Woodley
>>>>
>>>>
>>>>
>>>> On Dec 20, 2017, at 9:38 AM, Kristen Howell <kphowell at uw.edu> wrote:
>>>>
>>>> Thanks everyone. I got this parse error a lot with toy grammars that I
>>>> produced with the grammar matrix. I don't see lfr.tdl anywhere in those
>>>> output grammars. Does anyone know what might be the grammar matrix
>>>> equivalent?
>>>>
>>>> On Wed, Dec 20, 2017 at 12:57 AM, Dan Flickinger <danf at stanford.edu>
>>>> wrote:
>>>>
>>>>> Hi Olga,
>>>>>
>>>>>
>>>>> This error message from ACE occurs when your grammar includes one or
>>>>> more token-mapping (preprocessing) rules which conspire to delete all of
>>>>> the edges from one of the cells in the initial parse chart, so that the
>>>>> parser cannot hope to find a covering analysis.  For the ERG, this
>>>>> occasionally happens when I adjust the rules for inserting lexical entries
>>>>> for unknown words, and then try to filter some of those added entries in
>>>>> case I already have a "native" entry for that word.  I imagine you must
>>>>> have one or more rules in a file like the ERG's "lfr.tdl" which discard
>>>>> some unwanted entries before parsing.  Try commenting out these rules and
>>>>> see if the error message for that sentence disappears. You can see the
>>>>> error just by trying to parse that one sentence directly with ACE.
>>>>>
>>>>>
>>>>>  Dan
>>>>>
>>>>>
>>>>> ------------------------------
>>>>> *From:* developers-bounces at emmtee.net <developers-bounces at emmtee.net>
>>>>> on behalf of Olga Zamaraeva <olzama at uw.edu>
>>>>> *Sent:* Tuesday, December 19, 2017 11:07 PM
>>>>> *To:* Woodley Packard
>>>>> *Cc:* developers at delph-in.net
>>>>> *Subject:* Re: [developers] post-reduction lexical gap
>>>>>
>>>>> Hi Woodley,
>>>>>
>>>>> At the risk of asking a basic question: What does a lexical gap mean?
>>>>>
>>>>> Thank you,
>>>>> Olga
>>>>> On Tue, Dec 19, 2017 at 9:35 PM Woodley Packard <
>>>>> sweaglesw at sweaglesw.org> wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> This means ACE encountered a lexical gap.  It should not be a rare
>>>>> occurrence, at least on non-toy data.
>>>>>
>>>>> Best, Woodley
>>>>>
>>>>>
>>>>>
>>>>> On Dec 19, 2017, at 1:21 PM, Stephan Oepen <oe at ifi.uio.no> wrote:
>>>>>
>>>>> hi olga,
>>>>>
>>>>> this looks like an error message from the parsing client, not [incr
>>>>> tsdb()] proper.  i would guess maybe ACE?  if so, woodley will likely be
>>>>> able to shed more light on this question :-).
>>>>>
>>>>> best, oe
>>>>>
>>>>>
>>>>> On Tue, 19 Dec 2017 at 19:19 Olga Zamaraeva <olzama at uw.edu> wrote:
>>>>>
>>>>> Dear developers,
>>>>>
>>>>> What does the message "post-reduction lexical gap" mean in the context
>>>>> of adding an [ incr tsdb() ] profile for regression testing to the Grammar
>>>>> Matrix?
>>>>>
>>>>> Here's the type of line that we sometimes see in home/gold/mytest/parse
>>>>>
>>>>> 10@1@10@-1@0@-1@0@0@-1@0@0@-1@0@-1@-1@-1@-1@-1@-1@-1@-1@-1@-1@0@0@0
>>>>> @0@0@-1@-1@-1@-1@34728@-1@-1@-1@23-6-2013 14:28:24@post-reduction
>>>>> lexical gap@
>>>>>
>>>>> This seems problematic but we don't really know what it is at this
>>>>> point.
>>>>>
>>>>> (Here's a related topic on Discourse:
>>>>> https://delphinqa.ling.washington.edu/t/no-parses-when-adding-regression-tests/81/7)
>>>>>
>>>>> Thank you,
>>>>> Olga
>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Emily M. Bender
>>> Professor, Department of Linguistics
>>> Check out CLMS on facebook! http://www.facebook.com/uwclma
>>>
>>
>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 100783 bytes
Desc: not available
URL: <http://lists.delph-in.net/archives/developers/attachments/20171220/d9099028/attachment-0001.png>