[developers] post-reduction lexical gap

Olga Zamaraeva olzama at uw.edu
Wed Dec 20 21:04:56 CET 2017


I think deleting the space helps: I no longer see the error message in the
parse file. Admittedly, I did not test this separately from another change,
namely deleting the asterisk that the customization system had written into
the file and retyping it manually, but I will test that later.
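
For anyone who wants to check an item file for the two suspects raised in
this thread (stray ^M carriage returns, and whitespace between the asterisk
and the first word), here is a minimal Python sketch. The function names are
made up for illustration and are not part of the Grammar Matrix or
[incr tsdb()]. Whether the space after the asterisk is really a problem has
not been confirmed in isolation, so treat that check as a hint only.

```python
import re


def clean_item(line: str) -> str:
    """Normalize one test-suite item line (hypothetical helper)."""
    # Drop trailing carriage-return / newline characters (^M in some editors).
    text = line.rstrip("\r\n")
    # Remove whitespace between a leading asterisk and the first word.
    return re.sub(r"^\*\s+", "*", text)


def suspicious_items(lines):
    """Return (line_number, reason) pairs for lines worth a second look."""
    found = []
    for number, line in enumerate(lines, start=1):
        if "\r" in line:
            found.append((number, "carriage return"))
        if re.match(r"^\*\s", line):
            found.append((number, "space after asterisk"))
    return found
```

When reading a real file, open it with open(path, newline="") so that Python
does not silently translate the carriage returns away before the check runs.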

Thanks everyone!
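
P.S. To find which rows of a gold "parse" file carry this message, a small
Python sketch along these lines may help. It assumes only what the excerpts
quoted below suggest: one row per physical line in the actual file, fields
separated by "@", and the first field identifying the row. The function name
is hypothetical, and no other knowledge of the profile layout is assumed.

```python
def rows_with_message(lines, message="post-reduction lexical gap"):
    """Return the first field of every row in which some field equals `message`.

    Assumes "@"-separated fields, one row per line (an assumption based on
    the excerpts in this thread, not on the profile schema).
    """
    hits = []
    for line in lines:
        fields = line.rstrip("\n").split("@")
        if message in fields:
            hits.append(fields[0])
    return hits
```

Passing the open file object for home/gold/mytest/parse as `lines` should
list the affected rows, which can then be matched against the item file.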

On Wed, Dec 20, 2017 at 11:52 AM Kristen Howell <kphowell at uw.edu> wrote:

> Hi Olga, I do not use a space between the * and first word. I'm not sure
> that's the problem but maybe try removing it.
>
> I saw the error message by
> 1) loading the grammar in regression_test/grammars into the LKB
> 2) importing the item file into [incr tsdb()] (file-> import-> test items)
> 3) processing the profile (Process-> all items)
>
> The errors were printed to the emacs window.
>
> On Wed, Dec 20, 2017 at 11:27 AM, Olga Zamaraeva <olzama at uw.edu> wrote:
>
>> Kristen, where were you seeing the "No analysis" message? Simply in your
>> LKB window, right?
>>
>> I just did the following:
>>
>> 1) looked into the most recent regression test that I added, looked into
>> the "parse" file under home/gold/languagename. Here's what I see there:
>>
>> 1@1@1@-1@0@-1@0@2@-1@1@1@-1@1@-1@-1@-1@-1@-1@-1@-1@-1@-1@-1@0@0@0@0@0@-1@
>> -1@-1@-1@220636@-1@-1@-1@23-6-2013 14:28:24@0@
>> 2@1@2@-1@0@-1@0@0@-1@0@0@-1@0@-1@-1@-1@-1@-1@-1@-1@-1@-1@-1@0@0@0@0@0@-1@
>> -1@-1@-1@44636@-1@-1@-1@23-6-2013 14:28:24@post-reduction lexical gap@
>>
>> So, first item is fine, second one has the error message; incidentally,
>> the second one is ungrammatical, so, in the testsuite, it would start from
>> the star, though that should be completely fine, shouldn't it?
>>
>> The item in the corresponding testsuite looks like this:
>>
>> * noun1 noun2 tverb
>>
>> -- there is a space between the star and "noun1"; but I think that's how
>> my testsuites have always looked. Is that correct or should there be
>> no space, or are there any other caveats with the asterisk?
>>
>>
>> 2) Anyway, then I loaded this grammar (which is stored under
>> gmcs/regression_tests/grammars, until deleted) into the LKB and ran the
>> testsuite with [incr tsdb()]. I don't think I am seeing anything
>> suspicious, in particular no "No analysis" messages:
>>
>> [image: image.png]
>>
>> Interestingly, the star does not appear to be separated from the sentence
>> by a space; I assume the system removes it for this view. I am not sure
>> whether this is relevant.
>>
>> But it must be something going on with the asterisk in my case...
>>
>> Olga
>> On Wed, Dec 20, 2017 at 10:21 AM Kristen Howell <kphowell at uw.edu> wrote:
>>
>>> Thanks! It looks like my text editor had added ^M characters at the end
>>> of each line, so that's the unknown word/lexical gap. I'll update the
>>> discourse page with this as well.
>>>
>>> On Wed, Dec 20, 2017 at 10:00 AM, Emily M. Bender <ebender at uw.edu>
>>> wrote:
>>>
>>>> Hi Kristen,
>>>>
>>>> Two things to try:
>>>>
>>>> (1) Parse the sentence interactively with ACE (rather than the LKB)
>>>> (2) Open the [incr tsdb()] profile and parse the sentence interactively
>>>> (with the LKB) by double-clicking it
>>>>
>>>> Emily
>>>>
>>>> On Wed, Dec 20, 2017 at 9:57 AM, Woodley Packard <
>>>> sweaglesw at sweaglesw.org> wrote:
>>>>
>>>>> Hi again,
>>>>>
>>>>> Dan's assessment is accurate for the case of the ERG and other
>>>>> grammars that implement unknown word handling.  For smaller grammars
>>>>> however the error will be from cases where the input sentence contains a
>>>>> word that is not in the lexicon (which is what I meant by a lexical gap).
>>>>> More specifically, there is a word for which a stem cannot be found in the
>>>>> lexicon using the existing orthographemic rules.  Since Olga was talking
>>>>> about matrix regression testing I suspect this is her context.
>>>>>
>>>>> Woodley
>>>>>
>>>>>
>>>>>
>>>>> On Dec 20, 2017, at 9:38 AM, Kristen Howell <kphowell at uw.edu> wrote:
>>>>>
>>>>> Thanks everyone. I got this parse error a lot with toy grammars that I
>>>>> produced with the grammar matrix. I don't see lfr.tdl anywhere in those
>>>>> output grammars. Does anyone know what might be the grammar matrix
>>>>> equivalent?
>>>>>
>>>>> On Wed, Dec 20, 2017 at 12:57 AM, Dan Flickinger <danf at stanford.edu>
>>>>> wrote:
>>>>>
>>>>>> Hi Olga,
>>>>>>
>>>>>>
>>>>>> This error message from ACE occurs when your grammar includes one or
>>>>>> more token-mapping (preprocessing) rules which conspire to delete all of
>>>>>> the edges from one of the cells in the initial parse chart, so that the
>>>>>> parser cannot hope to find a covering analysis.  For the ERG, this
>>>>>> occasionally happens when I adjust the rules for inserting lexical entries
>>>>>> for unknown words, and then try to filter some of those added entries in
>>>>>> case I already have a "native" entry for that word.  I imagine you must
>>>>>> have one or more rules in a file like the ERG's "lfr.tdl" which discard
>>>>>> some unwanted entries before parsing.  Try commenting out these rules and
>>>>>> see if the error message for that sentence disappears. You can see the
>>>>>> error just by trying to parse that one sentence directly with ACE.
>>>>>>
>>>>>>
>>>>>>  Dan
>>>>>>
>>>>>>
>>>>>> ------------------------------
>>>>>> *From:* developers-bounces at emmtee.net <developers-bounces at emmtee.net>
>>>>>> on behalf of Olga Zamaraeva <olzama at uw.edu>
>>>>>> *Sent:* Tuesday, December 19, 2017 11:07 PM
>>>>>> *To:* Woodley Packard
>>>>>> *Cc:* developers at delph-in.net
>>>>>> *Subject:* Re: [developers] post-reduction lexical gap
>>>>>>
>>>>>> Hi Woodley,
>>>>>>
>>>>>> At the risk of asking a basic question: What does a lexical gap mean?
>>>>>>
>>>>>> Thank you,
>>>>>> Olga
>>>>>> On Tue, Dec 19, 2017 at 9:35 PM Woodley Packard <
>>>>>> sweaglesw at sweaglesw.org> wrote:
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> This means ACE encountered a lexical gap.  It should not be a rare
>>>>>> occurrence, at least on non-toy data.
>>>>>>
>>>>>> Best, Woodley
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Dec 19, 2017, at 1:21 PM, Stephan Oepen <oe at ifi.uio.no> wrote:
>>>>>>
>>>>>> hi olga,
>>>>>>
>>>>>> this looks like an error message from the parsing client, not [incr
>>>>>> tsdb()] proper.  i would guess maybe ACE?  if so, woodley will likely be
>>>>>> able to shed more light on this question :-).
>>>>>>
>>>>>> best, oe
>>>>>>
>>>>>>
>>>>>> On Tue, 19 Dec 2017 at 19:19 Olga Zamaraeva <olzama at uw.edu> wrote:
>>>>>>
>>>>>> Dear developers,
>>>>>>
>>>>>> What does the message "post-reduction lexical gap" mean in the
>>>>>> context of adding an [ incr tsdb() ] profile for regression testing to the
>>>>>> Grammar Matrix?
>>>>>>
>>>>>> Here's the type of line that we sometimes see in
>>>>>> home/gold/mytest/parse
>>>>>>
>>>>>> 10@1@10@-1@0@-1@0@0@-1@0@0@-1@0@-1@-1@-1@-1@-1@-1@-1@-1@-1@-1@0@0@0
>>>>>> @0@0@-1@-1@-1@-1@34728@-1@-1@-1@23-6-2013 14:28:24@post-reduction
>>>>>> lexical gap@
>>>>>>
>>>>>> This seems problematic but we don't really know what it is at this
>>>>>> point.
>>>>>>
>>>>>> (Here's a related topic on Discourse:
>>>>>> https://delphinqa.ling.washington.edu/t/no-parses-when-adding-regression-tests/81/7
>>>>>> )
>>>>>>
>>>>>> Thank you,
>>>>>> Olga
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Emily M. Bender
>>>> Professor, Department of Linguistics
>>>> Check out CLMS on facebook! http://www.facebook.com/uwclma
>>>>
>>>
>>>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 100783 bytes
Desc: not available
URL: <http://lists.delph-in.net/archives/developers/attachments/20171220/6b7d53b9/attachment-0001.png>

