[developers] Skipping non-parsed items in fftb
Emily M. Bender
ebender at uw.edu
Fri Jan 17 03:36:51 CET 2020
Thanks, Mike! I will give this a try.
On Thu, Jan 16, 2020 at 6:33 PM goodman.m.w at gmail.com <goodman.m.w at gmail.com> wrote:
> Hi Emily,
> For (2), here is how you could do it with PyDelphin:
> delphin process -g grm.dat original-profile/
> delphin mkprof --full --where 'readings > 0' --source original-profile/ new-profile/
> delphin process -g grm.dat --full-forest new-profile/
> Note that original-profile/ is first parsed in regular (non-forest) mode,
> because in full-forest mode the number of readings is essentially unknown
> until they are enumerated and thus the 'readings' field is always 0. The
> second command not only prunes lines in the 'parse' file with readings ==
> 0, but also lines in the 'item' file which correspond to those 'parse'
> lines. Once you have created new-profile/, you can parse again with
> --full-forest for use with FFTB (and of course you don't have to use
> PyDelphin for the parsing steps, if you prefer other means).
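The pruning performed by the second command can be sketched in plain Python. This is only an illustration of the logic described above, not PyDelphin's implementation: the helper name prune_unparsed and the in-memory rows are hypothetical, and only the field names (i-id, i-input, parse-id, readings) follow the usual [incr tsdb()] schema.

```python
def prune_unparsed(items, parses):
    """Keep 'parse' rows with readings > 0 and the 'item' rows they refer to.

    Hypothetical helper for illustration; rows are plain dicts rather than
    lines read from an actual profile on disk.
    """
    # Step 1: drop 'parse' rows that produced no readings.
    kept_parses = [p for p in parses if p["readings"] > 0]
    # Step 2: drop 'item' rows that no surviving 'parse' row points to.
    kept_ids = {p["i-id"] for p in kept_parses}
    kept_items = [i for i in items if i["i-id"] in kept_ids]
    return kept_items, kept_parses


# Toy data standing in for the 'item' and 'parse' files of a profile.
items = [
    {"i-id": 1, "i-input": "The dog barks."},
    {"i-id": 2, "i-input": "Dog the barks."},
    {"i-id": 3, "i-input": "The cat sleeps."},
]
parses = [
    {"parse-id": 1, "i-id": 1, "readings": 2},
    {"parse-id": 2, "i-id": 2, "readings": 0},
    {"parse-id": 3, "i-id": 3, "readings": 1},
]

kept_items, kept_parses = prune_unparsed(items, parses)
print([i["i-id"] for i in kept_items])
```

Because the filter depends on the readings field, it only makes sense on a profile parsed in regular mode, which is why the first command parses without --full-forest.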
> Also note that this results in a profile with no edges for partial parses.
> I think this is what you want. There should be a way to prune the
> full-forest profile directly while keeping partial parses, but while
> investigating this use case I found a bug, so I don't recommend it yet.
> Try `delphin mkprof --help` to see descriptions of these and other
> options. They map fairly directly to the function documented here:
> On Fri, Jan 17, 2020 at 8:44 AM Emily M. Bender <ebender at uw.edu> wrote:
>> Dear all,
>> We are doing some treebanking here at UW with fftb with grammars that
>> have very low coverage over their associated test corpora. The current
>> behavior of fftb with these profiles is to include all items for
>> treebanking, but give a 404 for each one with no parse forest stored. This
>> necessitates clicking the back button and tracking which one is next (since
>> nothing changes color). In that light, two questions:
>> (1) Is there some option we can pass fftb so that it just doesn't present
>> items with no parses?
>> (2) Failing that, is it fairly straightforward with pydelphin, [incr
>> tsdb()] or something else to export a version of the profiles that only
>> includes items which the grammar successfully parsed?
>> Emily M. Bender (she/her)
>> Howard and Frances Nostrand Endowed Professor
>> Department of Linguistics
>> Faculty Director, CLMS
>> University of Washington
>> Twitter: @emilymbender
> -Michael Wayne Goodman