[developers] Browsing items with lexical coverage using itsdb profiles created with pydelphin/ace

goodman.m.w at gmail.com goodman.m.w at gmail.com
Sat Sep 28 01:56:13 CEST 2019


Hi Kristen,

The item file and the item schema in the relations file both have 15
fields, so I don't think there is disagreement there (although I had some
encoding issues with the angled quotes in a comment in the relations file;
I just fixed them manually).
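
A minimal sketch of how such a field-count comparison could be done with only the Python standard library. It is not part of PyDelphin or [incr tsdb()]. The helper names `schema_field_count` and `record_field_counts` are hypothetical, and the sketch assumes the usual layout: a table name followed by a colon, one indented line per field, `#` for comment lines, and `@` as the field delimiter in the data files.

```python
def schema_field_count(relations_text, table):
    """Count the field lines listed under `table:` in relations-file text."""
    count, inside = 0, False
    for line in relations_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # skip comment lines
        if not line[:1].isspace() and stripped.endswith(":"):
            # An unindented "name:" line starts a new table definition.
            inside = (stripped[:-1] == table)
        elif inside and stripped:
            count += 1  # an indented, non-empty line is one field
    return count


def record_field_counts(lines):
    """Return the set of field counts seen across '@'-delimited records."""
    return {len(line.rstrip("\n").split("@")) for line in lines if line.strip()}


# Toy data for illustration only (not the attached files).
relations = "item:\n  i-id :integer :key\n  i-input :string\n\nparse:\n  parse-id :integer :key\n"
items = ["1@the dog barks\n", "2@dogs bark\n"]
print(schema_field_count(relations, "item"), record_field_counts(items))
```

If the set returned for the data file contains exactly one number and it matches the schema count, the two files agree on the number of fields.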

PyDelphin uses ACE's stdout protocols (see:
https://pydelphin.readthedocs.io/en/latest/api/delphin.ace.html#ace-stdout-protocols).
By default PyDelphin uses the --tsdb-stdout option of ACE to get as much
information as ACE can provide. If ACE provides the :error information,
PyDelphin will populate the corresponding field in a profile. From what I
recall, however, ACE does not output this field as consistently as the LKB
and PET, and sometimes it puts parsing errors on the stderr stream instead,
which PyDelphin does not capture.
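
To check whether the error field was populated in a given profile, the records of the 'parse' file can be inspected directly. The following is a rough standard-library sketch, not PyDelphin's own API. The function name `records_with_errors` is hypothetical, and the abbreviated field list in the example is an assumption for illustration; the real field order has to come from the profile's relations file.

```python
def records_with_errors(lines, fields):
    """Return the '@'-delimited records whose 'error' field is non-empty."""
    error_index = fields.index("error")
    flagged = []
    for line in lines:
        values = line.rstrip("\n").split("@")
        # A record counts as an error only if the field exists and is non-empty.
        if len(values) > error_index and values[error_index] != "":
            flagged.append(values)
    return flagged


# Abbreviated, made-up schema and records for illustration only.
fields = ["parse-id", "i-id", "readings", "error"]
parse_lines = ["0@1@2@\n", "1@2@0@some error message\n"]
for record in records_with_errors(parse_lines, fields):
    print(record[fields.index("i-id")], record[fields.index("error")])
```

An empty result from a check like this on a real profile would suggest that the error field was never filled in, rather than that there were no parsing problems.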


On Sat, Sep 28, 2019 at 4:53 AM Kristen Howell <kphowell at uw.edu> wrote:

> Perhaps there is some disagreement between my item and relations files? I
> generated the item file using the xigt exporter. I believe this is the
> corresponding relation file (it's the one I point to when using the
> exporter). I've attached both. I am creating the profile with the following
> steps (in python):
>     ts = itsdb.TestSuite('./unprocessed/wmb/')
>     ace.compile('./wmb/ace/config.tdl', './wmb/ace/wmb.dat')
>     with ace.AceParser('./wmb/ace/wmb.dat') as cpu:
>         ts.process(cpu)
>     ts.write(path='./output/processed/wmb')
>
>
> On Fri, Sep 27, 2019 at 1:06 PM Stephan Oepen <oe at ifi.uio.no> wrote:
>
>> yes, the 'parse' file (like the other files in a tsdb(1) database) is
>> a textual encoding of a set of tuples.  what you quote looks
>> suspiciously spartan to me, with only the first three fields filled
>> and the number of 'readings' filled in.  in a regular profile, i would
>> expect a record of the initial and internal tokenization, various
>> timings, and statistics about lexical instantiation and chart
>> construction.  i am relatively sure that ACE does account for most of
>> these, so i suspect that information is getting lost somewhere in your
>> pipeline.
>>
>> oe
>>
>> On Fri, Sep 27, 2019 at 9:56 PM Kristen Howell <kphowell at uw.edu> wrote:
>> >
>> > Thank you Stephan. Would the 'parse' relations be the lines in the parse
>> > file? They each look something like this:
>> > 0 at 0@0 at -1@@-1@@0 at -1@-1 at -1@-1 at -1@-1 at -1@-1 at -1@-1 at -1@-1 at -1@-1 at -1@-1 at -1@-1@
>> -1 at -1@-1 at -1@-1 at -1@-1 at -1@-1 at -1@@@
>> > Perhaps this means that the error field among other things is not being
>> > populated?
>> > Then the question for Mike and/or Woodley would be if it is expected to
>> > be populated.
>> >
>> > On Fri, Sep 27, 2019 at 12:33 PM Stephan Oepen <oe at ifi.uio.no> wrote:
>> >>
>> >> hi kristen,
>> >>
>> >> i had to peek at the [incr tsdb()] code myself; 'Browse Errors' will
>> >> extract all items where the 'error' field (in the 'parse' relation) is
>> >> a non-empty string.  so, if nothing comes up there, presumably either
>> >> there were no errors, or ACE does not populate that field?
>> >>
>> >> likewise, the pre-canned 'unproblematic' condition amounts to 'error
>> >> == ""', i.e. an empty string in that field.  to some degree, what to
>> >> consider an 'error' is arguably up to the parsing engine.  from
>> >> memory, i believe that both the LKB and PET will generate some
>> >> descriptive 'error' string for example in case of missing lexical
>> >> entries for some of the input tokens.
>> >>
>> >> it appears that ACE (or pyDelphin, not sure about the division of
>> >> labor here) maybe simply does not populate the 'error' field in the
>> >> profiles that it generates?
>> >>
>> >> best wishes, oe
>> >>
>> >> On Fri, Sep 27, 2019 at 7:09 PM Kristen Howell <kphowell at uw.edu>
>> wrote:
>> >> >
>> >> > Hi Mike and Woodley (and others?),
>> >> >
>> >> > I've created some itsdb profiles using pydelphin and a grammar
>> >> > loaded in ace. I am trying to browse the profile in [incr tsdb()]. The
>> >> > results and coverage show up fine. However, when I try to browse
>> >> > errors, nothing happens. Also when I try to view items with lexical
>> >> > coverage (using tsdl condition--> unproblematic and then browse -->
>> >> > test items), I see all of the items, not just those with lexical
>> >> > coverage.
>> >> >
>> >> > Is this expected to work with pydelphin profiles? If so, what might
>> >> > be missing? My profile contains non-empty item, parse, result,
>> >> > relations, and run files.
>> >> >
>> >> > Thanks for your help,
>> >> > Kristen
>>
>

-- 
-Michael Wayne Goodman