[developers] Skipping non-parsed items in fftb

Emily M. Bender ebender at uw.edu
Sat Jan 18 00:41:16 CET 2020


Dear Mike,

Alas, I'm hitting this error:

(run_agg) ebender@patas:~$ delphin process -g ctn1_grammar_fixed/ace/ctn1.dat ctn_orig/
Traceback (most recent call last):
  File "/home2/kphowell/Envs/run_agg/bin/delphin", line 11, in <module>
    sys.exit(main())
  File "/home2/kphowell/Envs/run_agg/lib/python3.6/site-packages/delphin/main.py", line 42, in main
    args.func(args)
  File "/home2/kphowell/Envs/run_agg/lib/python3.6/site-packages/delphin/main.py", line 135, in call_process
    gzip=args.gzip)
  File "/home2/kphowell/Envs/run_agg/lib/python3.6/site-packages/delphin/commands.py", line 540, in process
    source = itsdb.TestSuite(source)
  File "/home2/kphowell/Envs/run_agg/lib/python3.6/site-packages/delphin/itsdb.py", line 644, in __init__
    '*schema* argument is required for new test suites')
delphin.itsdb.ITSDBError: *schema* argument is required for new test suites

I'll poke around and see where the schema requirement is coming from
(the section on "process" in the documentation doesn't mention it),
but thought I'd post here too in the meantime.
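
One quick thing to rule out is whether the directory is being treated as a new, empty test suite. The sketch below assumes (this is not confirmed anywhere in this thread) that PyDelphin decides that by looking for a `relations` file in the profile directory. The helper name `looks_like_profile` is hypothetical and is not part of PyDelphin.

```python
import os


def looks_like_profile(path):
    """Return True if *path* contains a 'relations' file.

    Assumption: an [incr tsdb()] profile directory keeps its schema in a
    file named 'relations', and a directory without one is treated as a
    new test suite. This helper is illustrative only.
    """
    return os.path.isfile(os.path.join(path, "relations"))


# Example: check the directory passed to `delphin process` above.
print(looks_like_profile("ctn_orig/"))
```

If this prints False, the path given on the command line may not point at the profile itself (for example, the profile could be one directory further down).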

Emily

On Thu, Jan 16, 2020 at 6:46 PM goodman.m.w at gmail.com <goodman.m.w at gmail.com>
wrote:

> Let me know how it goes.
>
> And a clarification: the --full option on `mkprof` doesn't hurt, but it's
> unnecessary since you're re-parsing the created profile.
>
> Also here's the bug report for the other thing, if you're interested in
> that use case: https://github.com/delph-in/pydelphin/issues/273
>
> On Fri, Jan 17, 2020 at 10:37 AM Emily M. Bender <ebender at uw.edu> wrote:
>
>> Thanks, Mike! I will give this a try.
>>
>> On Thu, Jan 16, 2020 at 6:33 PM goodman.m.w at gmail.com <
>> goodman.m.w at gmail.com> wrote:
>>
>>> Hi Emily,
>>>
>>> For (2), here is how you could do it with PyDelphin:
>>>
>>>     delphin process -g grm.dat original-profile/
>>>     delphin mkprof --full --where 'readings > 0' --source original-profile/ new-profile/
>>>     delphin process -g grm.dat --full-forest new-profile/
>>>
>>> Note that original-profile/ is first parsed in regular (non-forest)
>>> mode, because in full-forest mode the number of readings is essentially
>>> unknown until they are enumerated and thus the 'readings' field is always
>>> 0. The second command not only prunes lines in the 'parse' file with
>>> readings == 0, but also lines in the 'item' file which correspond to those
>>> 'parse' lines. Once you have created new-profile/, you can parse again with
>>> --full-forest for use with FFTB (and of course you don't have to use
>>> PyDelphin for the parsing steps, if you prefer other means).
>>>
>>> Also note that this results in a profile with no edges for partial
>>> parses. I think this is what you want. There should be a way to prune the
>>> full-forest profile directly while keeping partial parses, but while
>>> investigating this use case I found a bug, so I don't recommend it yet.
>>>
>>> Try `delphin mkprof --help` to see descriptions of these and other
>>> options. They map fairly directly to the function documented here:
>>> https://pydelphin.readthedocs.io/en/latest/api/delphin.commands.html#mkprof
>>>
>>>
>>> On Fri, Jan 17, 2020 at 8:44 AM Emily M. Bender <ebender at uw.edu> wrote:
>>>
>>>> Dear all,
>>>>
>>>> We are doing some treebanking here at UW with fftb with grammars that
>>>> have very low coverage over their associated test corpora. The current
>>>> behavior of fftb with these profiles is to include all items for
>>>> treebanking, but give a 404 for each one with no parse forest stored. This
>>>> necessitates clicking the back button and tracking which one is next (since
>>>> nothing changes color). In that light, two questions:
>>>>
>>>> (1) Is there some option we can pass fftb so that it just doesn't
>>>> present items with no parses?
>>>> (2) Failing that, is it fairly straightforward with pydelphin, [incr
>>>> tsdb()] or something else to export a version of the profiles that only
>>>> includes items which the grammar successfully parsed?
>>>>
>>>> Thanks,
>>>> Emily
>>>>
>>>>
>>>> --
>>>> Emily M. Bender (she/her)
>>>> Howard and Frances Nostrand Endowed Professor
>>>> Department of Linguistics
>>>> Faculty Director, CLMS
>>>> University of Washington
>>>> Twitter: @emilymbender
>>>>
>>>
>>>
>>> --
>>> -Michael Wayne Goodman
>>>
>>
>
>
> --
> -Michael Wayne Goodman
>
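
For reference, the pruning described in the quoted message (drop 'parse' rows with readings == 0, then drop the 'item' rows that correspond to them) can be pictured with a toy example. This is a minimal sketch over in-memory rows, not PyDelphin's implementation. The `prune_unparsed` helper and the sample rows are hypothetical; only the field names `i-id` and `readings` are taken from the profile schema.

```python
def prune_unparsed(items, parses):
    """Keep only parse rows with readings > 0 and their matching items.

    *items* and *parses* are lists of dicts standing in for rows of the
    'item' and 'parse' files; rows are linked by the 'i-id' field.
    """
    kept_parses = [p for p in parses if p["readings"] > 0]
    kept_ids = {p["i-id"] for p in kept_parses}
    kept_items = [i for i in items if i["i-id"] in kept_ids]
    return kept_items, kept_parses


# Made-up sample data: item 20 has no readings and should be dropped.
items = [
    {"i-id": 10, "i-input": "first sentence"},
    {"i-id": 20, "i-input": "second sentence"},
    {"i-id": 30, "i-input": "third sentence"},
]
parses = [
    {"parse-id": 1, "i-id": 10, "readings": 2},
    {"parse-id": 2, "i-id": 20, "readings": 0},
    {"parse-id": 3, "i-id": 30, "readings": 1},
]

kept_items, kept_parses = prune_unparsed(items, parses)
print([i["i-id"] for i in kept_items])
```

Filtering on the 'parse' rows first and then following `i-id` back to 'item' mirrors the order in the description: the readings count lives in 'parse', so that is where the decision has to be made.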

