[developers] Skipping non-parsed items in fftb

Emily M. Bender ebender at uw.edu
Sat Jan 18 00:51:48 CET 2020


Apologies --- that error meant I hadn't given the right path to the
testsuite. Correcting that, I now see:

(run_agg) ebender@patas:/home2/kphowell/run_aggregation/output/emb_treebank$ delphin process -g ctn1_grammar_fixed/ace/ctn1.dat ctn_orig/
Traceback (most recent call last):
  File "/home2/kphowell/Envs/run_agg/bin/delphin", line 11, in <module>
    sys.exit(main())
  File "/home2/kphowell/Envs/run_agg/lib/python3.6/site-packages/delphin/main.py", line 42, in main
    args.func(args)
  File "/home2/kphowell/Envs/run_agg/lib/python3.6/site-packages/delphin/main.py", line 135, in call_process
    gzip=args.gzip)
  File "/home2/kphowell/Envs/run_agg/lib/python3.6/site-packages/delphin/commands.py", line 542, in process
    column, tablename, condition = _interpret_selection(select, source)
  File "/home2/kphowell/Envs/run_agg/lib/python3.6/site-packages/delphin/commands.py", line 562, in _interpret_selection
    if len(queryobj['tables']) == 1:
KeyError: 'tables'
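
Since the earlier error (quoted below) turned out to come from a wrong testsuite path, a small pre-check before calling `delphin process` may save a round trip. This is only a sketch: it assumes a profile directory can be recognized by the presence of its `relations` schema file, and the helper name `looks_like_profile` is hypothetical, not part of PyDelphin.

```python
from pathlib import Path


def looks_like_profile(path):
    """Return True if *path* is a directory that contains a 'relations' file.

    Assumption: an existing profile always carries its schema in a file
    named 'relations'. This helper is illustrative, not a PyDelphin API.
    """
    p = Path(path)
    return p.is_dir() and (p / "relations").is_file()


if __name__ == "__main__":
    import sys

    # Check every path given on the command line and report the result.
    for arg in sys.argv[1:]:
        status = "ok" if looks_like_profile(arg) else "no 'relations' file found"
        print(f"{arg}: {status}")
```

Running it with the testsuite path as its argument before processing would flag a mistyped or relative path right away.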

On Fri, Jan 17, 2020 at 3:41 PM Emily M. Bender <ebender at uw.edu> wrote:

> Dear Mike,
>
> Alas, I'm hitting this error:
>
> (run_agg) ebender@patas:~$ delphin process -g ctn1_grammar_fixed/ace/ctn1.dat ctn_orig/
> Traceback (most recent call last):
>   File "/home2/kphowell/Envs/run_agg/bin/delphin", line 11, in <module>
>     sys.exit(main())
>   File "/home2/kphowell/Envs/run_agg/lib/python3.6/site-packages/delphin/main.py", line 42, in main
>     args.func(args)
>   File "/home2/kphowell/Envs/run_agg/lib/python3.6/site-packages/delphin/main.py", line 135, in call_process
>     gzip=args.gzip)
>   File "/home2/kphowell/Envs/run_agg/lib/python3.6/site-packages/delphin/commands.py", line 540, in process
>     source = itsdb.TestSuite(source)
>   File "/home2/kphowell/Envs/run_agg/lib/python3.6/site-packages/delphin/itsdb.py", line 644, in __init__
>     '*schema* argument is required for new test suites')
> delphin.itsdb.ITSDBError: *schema* argument is required for new test suites
>
> I'll poke around and see where the schema requirement is coming from
> (nothing in the bit on "process" in the documentation page mentions it),
> but thought I'd post here too in the meantime.
>
> Emily
>
> On Thu, Jan 16, 2020 at 6:46 PM goodman.m.w at gmail.com <goodman.m.w at gmail.com> wrote:
>
>> Let me know how it goes.
>>
>> And a clarification: the --full option on `mkprof` doesn't hurt, but it's
>> unnecessary since you're re-parsing the created profile.
>>
>> Also here's the bug report for the other thing, if you're interested in
>> that use case: https://github.com/delph-in/pydelphin/issues/273
>>
>> On Fri, Jan 17, 2020 at 10:37 AM Emily M. Bender <ebender at uw.edu> wrote:
>>
>>> Thanks, Mike! I will give this a try.
>>>
>>> On Thu, Jan 16, 2020 at 6:33 PM goodman.m.w at gmail.com <goodman.m.w at gmail.com> wrote:
>>>
>>>> Hi Emily,
>>>>
>>>> For (2), here is how you could do it with PyDelphin:
>>>>
>>>>     delphin process -g grm.dat original-profile/
>>>>     delphin mkprof --full --where 'readings > 0' --source original-profile/ new-profile/
>>>>     delphin process -g grm.dat --full-forest new-profile/
>>>>
>>>> Note that original-profile/ is first parsed in regular (non-forest)
>>>> mode, because in full-forest mode the number of readings is essentially
>>>> unknown until they are enumerated and thus the 'readings' field is always
>>>> 0. The second command not only prunes lines in the 'parse' file with
>>>> readings == 0, but also lines in the 'item' file which correspond to those
>>>> 'parse' lines. Once you have created new-profile/, you can parse again with
>>>> --full-forest for use with FFTB (and of course you don't have to use
>>>> PyDelphin for the parsing steps, if you prefer other means).
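
To make the pruning step described above concrete, here is a rough sketch of the logic in plain Python. It does not call PyDelphin at all; the `prune` function and the toy rows are hypothetical, and only the field names `i-id`, `i-input`, and `readings` are meant to mirror the profile's 'item' and 'parse' files.

```python
def prune(items, parses):
    """Keep 'parse' rows with readings > 0 and the 'item' rows they refer to.

    Illustrative only: rows are plain dicts, not PyDelphin objects.
    """
    kept_parses = [p for p in parses if p["readings"] > 0]
    # Collect the item ids that still have at least one surviving parse row.
    kept_ids = {p["i-id"] for p in kept_parses}
    kept_items = [i for i in items if i["i-id"] in kept_ids]
    return kept_items, kept_parses


# Toy data: item 2 has no readings and should be dropped from both tables.
items = [
    {"i-id": 1, "i-input": "first sentence"},
    {"i-id": 2, "i-input": "second sentence"},
    {"i-id": 3, "i-input": "third sentence"},
]
parses = [
    {"i-id": 1, "readings": 2},
    {"i-id": 2, "readings": 0},
    {"i-id": 3, "readings": 1},
]

kept_items, kept_parses = prune(items, parses)
print([i["i-id"] for i in kept_items])  # prints [1, 3]
```

The point of filtering both tables together is that the resulting profile contains no item without a matching parse row, so fftb would have nothing to present that leads to a 404.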
>>>>
>>>> Also note that this results in a profile with no edges for partial
>>>> parses. I think this is what you want. There should be a way to prune the
>>>> full-forest profile directly while keeping partial parses, but while
>>>> investigating this use case I found a bug, so I don't recommend it yet.
>>>>
>>>> Try `delphin mkprof --help` to see descriptions of these and other
>>>> options. They map fairly directly to the function documented here:
>>>> https://pydelphin.readthedocs.io/en/latest/api/delphin.commands.html#mkprof
>>>>
>>>>
>>>> On Fri, Jan 17, 2020 at 8:44 AM Emily M. Bender <ebender at uw.edu> wrote:
>>>>
>>>>> Dear all,
>>>>>
>>>>> We are doing some treebanking here at UW with fftb with grammars that
>>>>> have very low coverage over their associated test corpora. The current
>>>>> behavior of fftb with these profiles is to include all items for
>>>>> treebanking, but give a 404 for each one with no parse forest stored. This
>>>>> necessitates clicking the back button and tracking which one is next (since
>>>>> nothing changes color). In that light, two questions:
>>>>>
>>>>> (1) Is there some option we can pass fftb so that it just doesn't
>>>>> present items with no parses?
>>>>> (2) Failing that, is it fairly straightforward with pydelphin, [incr
>>>>> tsdb()] or something else to export a version of the profiles that only
>>>>> includes items which the grammar successfully parsed?
>>>>>
>>>>> Thanks,
>>>>> Emily
>>>>>
>>>>>
>>>>> --
>>>>> Emily M. Bender (she/her)
>>>>> Howard and Frances Nostrand Endowed Professor
>>>>> Department of Linguistics
>>>>> Faculty Director, CLMS
>>>>> University of Washington
>>>>> Twitter: @emilymbender
>>>>>
>>>>
>>>>
>>>> --
>>>> -Michael Wayne Goodman
>>>>
>>
>>
>
>


-- 
Emily M. Bender (she/her)
Howard and Frances Nostrand Endowed Professor
Department of Linguistics
Faculty Director, CLMS
University of Washington
Twitter: @emilymbender