[developers] Skipping non-parsed items in fftb

goodman.m.w at gmail.com goodman.m.w at gmail.com
Sat Jan 18 01:17:13 CET 2020


Hi Emily,

Yes, those error messages are not very clear. But the second one looks like
it comes from old code, as 'tables' is no longer a key in the object it's
being looked up on. I suggest making sure that your run_agg environment has
an up-to-date version of PyDelphin. While the environment is active, run
`pip install -U pydelphin`, check that the installed version is 1.0 or newer
(`delphin --version`), and then try again.
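
If you would rather check the version from Python than from the command
line, here is a minimal sketch. It assumes Python 3.8 or newer for
`importlib.metadata` (the traceback below shows a 3.6 environment, where
`pip show pydelphin` gives the same information). The helper names are made
up for illustration and are not part of PyDelphin.

```python
import re
from importlib import metadata  # standard library from Python 3.8 on


def installed_version(package):
    """Return the installed version string of *package*, or None."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Turn '1.2.0' into (1, 2, 0), ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def is_at_least(version, minimum=(1, 0)):
    """True if *version* is not None and compares >= *minimum*."""
    return version is not None and version_tuple(version) >= minimum


if __name__ == "__main__":
    # Run this with the run_agg environment active.
    found = installed_version("pydelphin")
    print("pydelphin:", found, "| new enough:", is_at_least(found))
```

If this reports a version below 1.0 (or None), the `delphin` command in that
environment is most likely still the old code.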

On Sat, Jan 18, 2020 at 7:52 AM Emily M. Bender <ebender at uw.edu> wrote:

> Apologies --- that error meant I hadn't given the right path to the
> testsuite. Correcting that, I now see:
>
> (run_agg) ebender at patas:/home2/kphowell/run_aggregation/output/emb_treebank$ delphin process -g ctn1_grammar_fixed/ace/ctn1.dat ctn_orig/
> Traceback (most recent call last):
>   File "/home2/kphowell/Envs/run_agg/bin/delphin", line 11, in <module>
>     sys.exit(main())
>   File "/home2/kphowell/Envs/run_agg/lib/python3.6/site-packages/delphin/main.py", line 42, in main
>     args.func(args)
>   File "/home2/kphowell/Envs/run_agg/lib/python3.6/site-packages/delphin/main.py", line 135, in call_process
>     gzip=args.gzip)
>   File "/home2/kphowell/Envs/run_agg/lib/python3.6/site-packages/delphin/commands.py", line 542, in process
>     column, tablename, condition = _interpret_selection(select, source)
>   File "/home2/kphowell/Envs/run_agg/lib/python3.6/site-packages/delphin/commands.py", line 562, in _interpret_selection
>     if len(queryobj['tables']) == 1:
> KeyError: 'tables'
>
> On Fri, Jan 17, 2020 at 3:41 PM Emily M. Bender <ebender at uw.edu> wrote:
>
>> Dear Mike,
>>
>> Alas, I'm hitting this error:
>>
>> (run_agg) ebender at patas:~$ delphin process -g ctn1_grammar_fixed/ace/ctn1.dat ctn_orig/
>> Traceback (most recent call last):
>>   File "/home2/kphowell/Envs/run_agg/bin/delphin", line 11, in <module>
>>     sys.exit(main())
>>   File "/home2/kphowell/Envs/run_agg/lib/python3.6/site-packages/delphin/main.py", line 42, in main
>>     args.func(args)
>>   File "/home2/kphowell/Envs/run_agg/lib/python3.6/site-packages/delphin/main.py", line 135, in call_process
>>     gzip=args.gzip)
>>   File "/home2/kphowell/Envs/run_agg/lib/python3.6/site-packages/delphin/commands.py", line 540, in process
>>     source = itsdb.TestSuite(source)
>>   File "/home2/kphowell/Envs/run_agg/lib/python3.6/site-packages/delphin/itsdb.py", line 644, in __init__
>>     '*schema* argument is required for new test suites')
>> delphin.itsdb.ITSDBError: *schema* argument is required for new test suites
>>
>> I'll poke around and see where the schema requirement is coming from
>> (nothing in the bit on "process" in the documentation page mentions it),
>> but thought I'd post here too in the meantime.
>>
>> Emily
>>
>> On Thu, Jan 16, 2020 at 6:46 PM goodman.m.w at gmail.com <
>> goodman.m.w at gmail.com> wrote:
>>
>>> Let me know how it goes.
>>>
>>> And a clarification: the --full option on `mkprof` doesn't hurt, but
>>> it's unnecessary since you're re-parsing the created profile.
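
With `--full` dropped as per that clarification, the three steps quoted
further down could be scripted roughly as follows. This is only a sketch:
the function name is hypothetical, the options are the ones already used in
this thread, and the commands are printed, not executed.

```python
import shlex


def treebank_prep_commands(grammar, original, pruned):
    """Return the three `delphin` invocations as argument lists."""
    return [
        # 1. Parse in regular mode so that 'readings' is populated.
        ["delphin", "process", "-g", grammar, original],
        # 2. Copy only the items that received at least one reading.
        ["delphin", "mkprof", "--where", "readings > 0",
         "--source", original, pruned],
        # 3. Re-parse the pruned profile in full-forest mode for FFTB.
        ["delphin", "process", "-g", grammar, "--full-forest", pruned],
    ]


if __name__ == "__main__":
    for command in treebank_prep_commands(
            "grm.dat", "original-profile/", "new-profile/"):
        # Printed for inspection; each list could be handed to
        # subprocess.run(command, check=True) once it looks right.
        print(" ".join(shlex.quote(arg) for arg in command))
```

Building argument lists and not shell strings keeps the
`'readings > 0'` condition intact without any manual quoting.
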
>>>
>>> Also here's the bug report for the other thing, if you're interested in
>>> that use case: https://github.com/delph-in/pydelphin/issues/273
>>>
>>> On Fri, Jan 17, 2020 at 10:37 AM Emily M. Bender <ebender at uw.edu> wrote:
>>>
>>>> Thanks, Mike! I will give this a try.
>>>>
>>>> On Thu, Jan 16, 2020 at 6:33 PM goodman.m.w at gmail.com <
>>>> goodman.m.w at gmail.com> wrote:
>>>>
>>>>> Hi Emily,
>>>>>
>>>>> For (2), here is how you could do it with PyDelphin:
>>>>>
>>>>>     delphin process -g grm.dat original-profile/
>>>>>     delphin mkprof --full --where 'readings > 0' --source original-profile/ new-profile/
>>>>>     delphin process -g grm.dat --full-forest new-profile/
>>>>>
>>>>> Note that original-profile/ is first parsed in regular (non-forest)
>>>>> mode, because in full-forest mode the number of readings is essentially
>>>>> unknown until they are enumerated and thus the 'readings' field is always
>>>>> 0. The second command not only prunes lines in the 'parse' file with
>>>>> readings == 0, but also lines in the 'item' file which correspond to those
>>>>> 'parse' lines. Once you have created new-profile/, you can parse again with
>>>>> --full-forest for use with FFTB (and of course you don't have to use
>>>>> PyDelphin for the parsing steps, if you prefer other means).
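
To picture what the pruning step described above does to the two files,
here is a toy sketch in plain Python. It is not PyDelphin's implementation.
Rows are ordinary dicts, the data is invented, and the field names other
than 'readings' (`i-id`, `i-input`, `parse-id`) are assumed from the usual
[incr tsdb()] schema.

```python
def prune_unparsed(items, parses):
    """Keep 'parse' rows with readings > 0 and the 'item' rows they refer to.

    The 'i-id' field is what links a 'parse' row to its 'item' row.
    """
    kept_parses = [row for row in parses if row["readings"] > 0]
    parsed_ids = {row["i-id"] for row in kept_parses}
    kept_items = [row for row in items if row["i-id"] in parsed_ids]
    return kept_items, kept_parses


# Hypothetical toy data, not taken from any real profile.
items = [
    {"i-id": 1, "i-input": "first sentence"},
    {"i-id": 2, "i-input": "second sentence"},
    {"i-id": 3, "i-input": "third sentence"},
]
parses = [
    {"parse-id": 1, "i-id": 1, "readings": 2},
    {"parse-id": 2, "i-id": 2, "readings": 0},
    {"parse-id": 3, "i-id": 3, "readings": 1},
]

kept_items, kept_parses = prune_unparsed(items, parses)
print([row["i-id"] for row in kept_items])  # [1, 3]
```

Item 2 has no readings, so both its 'parse' row and its 'item' row are
dropped, which is why FFTB would no longer list it.
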
>>>>>
>>>>> Also note that this results in a profile with no edges for partial
>>>>> parses. I think this is what you want. There should be a way to prune the
>>>>> full-forest profile directly while keeping partial parses, but while
>>>>> investigating this use case I found a bug, so I don't recommend it yet.
>>>>>
>>>>> Try `delphin mkprof --help` to see descriptions of these and other
>>>>> options. They map fairly directly to the function documented here:
>>>>> https://pydelphin.readthedocs.io/en/latest/api/delphin.commands.html#mkprof
>>>>>
>>>>>
>>>>> On Fri, Jan 17, 2020 at 8:44 AM Emily M. Bender <ebender at uw.edu>
>>>>> wrote:
>>>>>
>>>>>> Dear all,
>>>>>>
>>>>>> We are doing some treebanking here at UW with fftb with grammars that
>>>>>> have very low coverage over their associated test corpora. The current
>>>>>> behavior of fftb with these profiles is to include all items for
>>>>>> treebanking, but give a 404 for each one with no parse forest stored. This
>>>>>> necessitates clicking the back button and tracking which one is next (since
>>>>>> nothing changes color). In that light, two questions:
>>>>>>
>>>>>> (1) Is there some option we can pass fftb so that it just doesn't
>>>>>> present items with no parses?
>>>>>> (2) Failing that, is it fairly straightforward with pydelphin, [incr
>>>>>> tsdb()] or something else to export a version of the profiles that only
>>>>>> includes items which the grammar successfully parsed?
>>>>>>
>>>>>> Thanks,
>>>>>> Emily
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Emily M. Bender (she/her)
>>>>>> Howard and Frances Nostrand Endowed Professor
>>>>>> Department of Linguistics
>>>>>> Faculty Director, CLMS
>>>>>> University of Washington
>>>>>> Twitter: @emilymbender
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> -Michael Wayne Goodman
>>>>>
>>>>
>>>
>>>
>>> --
>>> -Michael Wayne Goodman
>>>
>>
>>
>>
>
>
>


-- 
-Michael Wayne Goodman