[developers] Skipping non-parsed items in fftb

goodman.m.w at gmail.com goodman.m.w at gmail.com
Fri Jan 17 03:33:03 CET 2020


Hi Emily,

For (2), here is how you could do it with PyDelphin:

    delphin process -g grm.dat original-profile/
    delphin mkprof --full --where 'readings > 0' --source original-profile/ \
        new-profile/
    delphin process -g grm.dat --full-forest new-profile/

Note that original-profile/ is first parsed in regular (non-forest) mode,
because in full-forest mode the number of readings is essentially unknown
until they are enumerated and thus the 'readings' field is always 0. The
second command not only prunes lines in the 'parse' file with readings ==
0, but also lines in the 'item' file which correspond to those 'parse'
lines. Once you have created new-profile/, you can parse again with
--full-forest for use with FFTB (and of course you don't have to use
PyDelphin for the parsing steps, if you prefer other means).
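
To make the pruning in the second command concrete, here is a minimal sketch in plain Python of the selection it performs. This is not PyDelphin's implementation, and the function name prune_unparsed and the sample rows are made up for illustration. It assumes each row is a dict keyed by the usual profile field names (i-id, i-input, parse-id, readings).

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def prune_unparsed(items: List[Row], parses: List[Row]) -> Tuple[List[Row], List[Row]]:
    """Keep 'parse' rows with readings > 0 and the 'item' rows they refer to.

    Hypothetical helper for illustration only; not part of PyDelphin.
    """
    # Step 1: the --where 'readings > 0' condition applied to the 'parse' rows.
    kept_parses = [p for p in parses if p["readings"] > 0]
    # Step 2: keep only the 'item' rows whose i-id survives in 'parse'.
    kept_ids = {p["i-id"] for p in kept_parses}
    kept_items = [i for i in items if i["i-id"] in kept_ids]
    return kept_items, kept_parses


# Made-up sample data: item 20 has no readings and should be dropped.
items = [
    {"i-id": 10, "i-input": "The dog barks."},
    {"i-id": 20, "i-input": "Dog the barks."},
    {"i-id": 30, "i-input": "Dogs bark."},
]
parses = [
    {"parse-id": 1, "i-id": 10, "readings": 2},
    {"parse-id": 2, "i-id": 20, "readings": 0},
    {"parse-id": 3, "i-id": 30, "readings": 1},
]

new_items, new_parses = prune_unparsed(items, parses)
print([i["i-id"] for i in new_items])  # [10, 30]
```

The point of the second step is that FFTB lists whatever is in the 'item' file, so dropping only the 'parse' rows would still leave the unparsed items visible.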

Also note that this results in a profile with no edges for partial parses.
I think this is what you want. There should be a way to prune the
full-forest profile directly while keeping partial parses, but while
investigating this use case I found a bug, so I don't recommend it yet.

Try `delphin mkprof --help` to see descriptions of these and other options.
They map fairly directly to the function documented here:
https://pydelphin.readthedocs.io/en/latest/api/delphin.commands.html#mkprof


On Fri, Jan 17, 2020 at 8:44 AM Emily M. Bender <ebender at uw.edu> wrote:

> Dear all,
>
> We are doing some treebanking here at UW with fftb with grammars that have
> very low coverage over their associated test corpora. The current behavior
> of fftb with these profiles is to include all items for treebanking, but
> give a 404 for each one with no parse forest stored. This necessitates
> clicking the back button and tracking which one is next (since
> nothing changes color). In that light, two questions:
>
> (1) Is there some option we can pass fftb so that it just doesn't present
> items with no parses?
> (2) Failing that, is it fairly straightforward with pydelphin, [incr
> tsdb()] or something else to export a version of the profiles that only
> includes items which the grammar successfully parsed?
>
> Thanks,
> Emily
>
>
> --
> Emily M. Bender (she/her)
> Howard and Frances Nostrand Endowed Professor
> Department of Linguistics
> Faculty Director, CLMS
> University of Washington
> Twitter: @emilymbender
>


-- 
-Michael Wayne Goodman