[developers] Skipping non-parsed items in fftb

Emily M. Bender ebender at uw.edu
Fri Jan 17 03:36:51 CET 2020


Thanks, Mike! I will give this a try.

On Thu, Jan 16, 2020 at 6:33 PM goodman.m.w at gmail.com <goodman.m.w at gmail.com>
wrote:

> Hi Emily,
>
> For (2), here is how you could do it with PyDelphin:
>
>     delphin process -g grm.dat original-profile/
>     delphin mkprof --full --where 'readings > 0' --source original-profile/ new-profile/
>     delphin process -g grm.dat --full-forest new-profile/
>
> Note that original-profile/ is first parsed in regular (non-forest) mode,
> because in full-forest mode the number of readings is essentially unknown
> until they are enumerated and thus the 'readings' field is always 0. The
> second command not only prunes lines in the 'parse' file with readings ==
> 0, but also lines in the 'item' file which correspond to those 'parse'
> lines. Once you have created new-profile/, you can parse again with
> --full-forest for use with FFTB (and of course you don't have to use
> PyDelphin for the parsing steps, if you prefer other means).
>
> Also note that this results in a profile with no edges for partial parses.
> I think this is what you want. There should be a way to prune the
> full-forest profile directly while keeping partial parses, but while
> investigating this use case I found a bug, so I don't recommend it yet.
>
> Try `delphin mkprof --help` to see descriptions of these and other
> options. They map fairly directly to the function documented here:
> https://pydelphin.readthedocs.io/en/latest/api/delphin.commands.html#mkprof
>
>
> On Fri, Jan 17, 2020 at 8:44 AM Emily M. Bender <ebender at uw.edu> wrote:
>
>> Dear all,
>>
>> We are doing some treebanking here at UW with fftb with grammars that
>> have very low coverage over their associated test corpora. The current
>> behavior of fftb with these profiles is to include all items for
>> treebanking, but give a 404 for each one with no parse forest stored. This
>> necessitates clicking the back button and tracking which one is next (since
>> nothing changes color). In that light, two questions:
>>
>> (1) Is there some option we can pass fftb so that it just doesn't present
>> items with no parses?
>> (2) Failing that, is it fairly straightforward with pydelphin, [incr
>> tsdb()] or something else to export a version of the profiles that only
>> includes items which the grammar successfully parsed?
>>
>> Thanks,
>> Emily
>>
>>
>> --
>> Emily M. Bender (she/her)
>> Howard and Frances Nostrand Endowed Professor
>> Department of Linguistics
>> Faculty Director, CLMS
>> University of Washington
>> Twitter: @emilymbender
>>
>
>
> --
> -Michael Wayne Goodman
>
-- 
Emily M. Bender (she/her)
Howard and Frances Nostrand Endowed Professor
Department of Linguistics
Faculty Director, CLMS
University of Washington
Twitter: @emilymbender
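
For readers who want to see what the pruning step described above amounts to, here is a minimal sketch in plain Python. It is not PyDelphin's implementation, only an illustration of the idea: keep the 'parse' rows with readings > 0, then keep only the 'item' rows those parses point to. The function name prune_unparsed is hypothetical, and the field positions (i-id first in 'item', i-id third and readings sixth in 'parse') are assumptions about the profile schema that should be checked against the 'relations' file of the actual profile.

```python
def prune_unparsed(item_rows, parse_rows,
                   item_iid=0, parse_iid=2, parse_readings=5):
    """Keep 'parse' rows with readings > 0 and the 'item' rows they refer to.

    Rows are lists of string fields. The default column indices are
    assumptions about the schema, not something guaranteed for every profile.
    """
    # Step 1: drop parse rows that record zero readings.
    kept_parses = [row for row in parse_rows if int(row[parse_readings]) > 0]
    # Step 2: collect the item ids that still have a parse row.
    parsed_ids = {row[parse_iid] for row in kept_parses}
    # Step 3: drop item rows whose id no longer appears in 'parse'.
    kept_items = [row for row in item_rows if row[item_iid] in parsed_ids]
    return kept_items, kept_parses


# Toy data only: real rows have many more '@'-separated fields.
items = [line.split("@") for line in ["1@The dog barks.", "2@Dog the barks."]]
parses = [line.split("@") for line in ["1@1@1@@@2", "2@1@2@@@0"]]
kept_items, kept_parses = prune_unparsed(items, parses)
```

In this toy example item 2 has readings == 0, so it is dropped from both tables, which is why it would no longer show up for treebanking. For real profiles the mkprof command quoted above is the better route, since it also takes care of the remaining files in the profile.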