[developers] Browsing items with lexical coverage using itsdb profiles created with pydelphin/ace

Mon Sep 30 19:30:26 CEST 2019

I get the error field populated when I use “art” to record profiles.  Are you passing --tsdb-notes to ace?  It may help.

Woodley

> On Sep 30, 2019, at 7:59 AM, Kristen Howell <kphowell at uw.edu> wrote:
> 
> Thanks Mike. You're right: the error information is showing up in stderr rather than stdout, which is why PyDelphin isn't picking it up.
> So it sounds like I'm out of luck as far as generating profiles using ACE and then inspecting them with [incr tsdb()]. I will either need to use the LKB/PET to parse, or look at the stderr from ACE to see my lexical coverage.
> Unless, Woodley, is there a command/option in ACE to send parse errors to stdout?
> 
>> On Fri, Sep 27, 2019 at 4:56 PM goodman.m.w at gmail.com <goodman.m.w at gmail.com> wrote:
>> Hi Kristen,
>> 
>> The item file and the item schema in the relations file both have 15 fields, so I don't think there is disagreement there (although I had some encoding issues with the angled quotes on a comment in the relations file; I just fixed it manually).
>> 
>> PyDelphin uses ACE's stdout protocols (see: https://pydelphin.readthedocs.io/en/latest/api/delphin.ace.html#ace-stdout-protocols). By default PyDelphin uses the --tsdb-stdout option of ACE to get as much information as ACE can provide. If ACE provides the :error information, PyDelphin will populate the corresponding field in a profile. From what I recall, however, ACE does not output this field as consistently as the LKB and PET, and sometimes it puts parsing errors on the stderr stream instead, which PyDelphin does not capture.
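The stdout/stderr split described above can be illustrated with a minimal sketch. It does not call ACE or PyDelphin: the child process is a stand-in Python one-liner, and the function name `run_and_split_streams` and the message texts are assumptions made up for this example. It only shows that a wrapper reading stdout alone never sees what the child writes to stderr, unless stderr is captured separately.

```python
import subprocess
import sys


def run_and_split_streams(cmd, input_text):
    """Run cmd, feed it input_text, and return (stdout_lines, stderr_lines)."""
    proc = subprocess.run(
        cmd,
        input=input_text,
        stdout=subprocess.PIPE,   # the stream a stdout-based protocol reads
        stderr=subprocess.PIPE,   # captured separately instead of being lost
        text=True,
    )
    return proc.stdout.splitlines(), proc.stderr.splitlines()


# Stand-in child process (NOT ACE): echoes a result for each input line on
# stdout and writes a made-up diagnostic for it on stderr.
CHILD = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    print('result for ' + line.strip())\n"
    "    print('diagnostic for ' + line.strip(), file=sys.stderr)\n"
)

out_lines, err_lines = run_and_split_streams([sys.executable, "-c", CHILD], "abc\n")
```

Here `out_lines` holds only the result line and `err_lines` holds only the diagnostic, so anything reported solely on stderr has to be collected and matched back to the input by the caller.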
>> 
>> 
>>> On Sat, Sep 28, 2019 at 4:53 AM Kristen Howell <kphowell at uw.edu> wrote:
>>> Perhaps there is some disagreement between my item and relations files? I generated the item file using the xigt exporter. I believe this is the corresponding relation file (it's the one I point to when using the exporter). I've attached both. I am creating the profile with the following steps (in python):
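>>>     # imports assume the PyDelphin 0.9.x module layout
>>>     from delphin import itsdb
>>>     from delphin.interfaces import ace
>>>     ts = itsdb.TestSuite('./unprocessed/wmb/')
>>>     ace.compile('./wmb/ace/config.tdl', './wmb/ace/wmb.dat')
>>>     with ace.AceParser('./wmb/ace/wmb.dat') as cpu:
>>>         ts.process(cpu)
>>>     ts.write(path='./output/processed/wmb')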
>>> 
>>> 
>>>> On Fri, Sep 27, 2019 at 1:06 PM Stephan Oepen <oe at ifi.uio.no> wrote:
>>>> yes, the 'parse' file (like the other files in a tsdb(1) database) is
>>>> a textual encoding of a set of tuples.  what you quote looks
>>>> suspiciously spartan to me, with only the first three fields filled
>>>> and the number of 'readings' filled in.  in a regular profile, i would
>>>> expect a record of the initial and internal tokenization, various
>>>> timings, and statistics about lexical instantiation and chart
>>>> construction.  i am relatively sure that ACE does account for most of
>>>> these, so i suspect that information is getting lost somewhere in your
>>>> pipeline.
>>>> 
>>>> oe
>>>> 
>>>> On Fri, Sep 27, 2019 at 9:56 PM Kristen Howell <kphowell at uw.edu> wrote:
>>>> >
>>>> > Thank you Stephan. Would the 'parse' relations be the lines in the parse file? They each look something like this:
>>>> > 0 at 0@0 at -1@@-1@@0 at -1@-1 at -1@-1 at -1@-1 at -1@-1 at -1@-1 at -1@-1 at -1@-1 at -1@-1 at -1@-1 at -1@-1 at -1@-1 at -1@-1 at -1@-1 at -1@-1@@@
>>>> > Perhaps this means that the error field among other things is not being populated?
>>>> > Then the question for Mike and/or Woodley would be if it is expected to be populated.
>>>> >
>>>> > On Fri, Sep 27, 2019 at 12:33 PM Stephan Oepen <oe at ifi.uio.no> wrote:
>>>> >>
>>>> >> hi kristen,
>>>> >>
>>>> >> i had to peek at the [incr tsdb()] code myself; 'Browse Errors' will
>>>> >> extract all items where the 'error' field (in the 'parse' relation) is
>>>> >> a non-empty string.  so, if nothing comes up there, presumably there
>>>> >> either were not errors, or ACE does not populate that field?
>>>> >>
>>>> >> likewise, the pre-canned 'unproblematic' condition amounts to 'error
>>>> >> == ""', i.e. an empty string in that field.  to some degree, what to
>>>> >> consider an 'error' is arguably up to the parsing engine.  from
>>>> >> memory, i believe that both the LKB and PET will generate some
>>>> >> descriptive 'error' string for example in case of missing lexical
>>>> >> entries for some of the input tokens.
>>>> >>
>>>> >> it appears that ACE (or pyDelphin, not sure about the division of
>>>> >> labor here) maybe simply does not populate the 'error' field in the
>>>> >> profiles that it generates?
>>>> >>
>>>> >> best wishes, oe
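The 'Browse Errors' and 'unproblematic' conditions described in this message amount to a test on one field, which can be sketched as follows. This is a minimal illustration, not [incr tsdb()] or PyDelphin code: the function name `split_by_error`, the heavily shortened field list, and the sample records (including the error string) are all made up for the example, and it ignores any escaping of the `@` separator inside field values.

```python
def split_by_error(parse_lines, field_names):
    """Partition '@'-separated records into (with_error, unproblematic)."""
    error_index = field_names.index("error")
    with_error, unproblematic = [], []
    for line in parse_lines:
        fields = line.rstrip("\n").split("@")
        if fields[error_index] != "":
            with_error.append(fields)      # what 'Browse Errors' would show
        else:
            unproblematic.append(fields)   # matches the condition error == ""
    return with_error, unproblematic


# Hypothetical, shortened schema; a real 'parse' relation has many more fields.
FIELDS = ["parse-id", "i-id", "readings", "error", "comment"]

# Made-up records: the second has an error string, the third has no readings
# but also no error string.
RECORDS = [
    "0@1@2@@",
    "1@2@0@no lexical entry for token@",
    "2@3@0@@",
]

errors, clean = split_by_error(RECORDS, FIELDS)
```

Note that the third record lands in `clean` even though it has zero readings: if the parser never writes an error string, every item satisfies the 'unproblematic' condition, which matches the behaviour reported at the start of the thread.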
>>>> >>
>>>> >> On Fri, Sep 27, 2019 at 7:09 PM Kristen Howell <kphowell at uw.edu> wrote:
>>>> >> >
>>>> >> > Hi Mike and Woodley (and others?),
>>>> >> >
>>>> >> > I've created some itsdb profiles using pydelphin and a grammar loaded in ace. I am trying to browse the profile in [incr tsdb()]. The results and coverage show up fine. However, when I try to browse errors, nothing happens. Also when I try to view items with lexical coverage (using tsdl condition--> unproblematic and then browse --> test items), I see all of the items, not just those with lexical coverage.
>>>> >> >
>>>> >> > Is this expected to work with pydelphin profiles? If so, what might be missing? My profile contains non-empty item, parse, result, relations, and run files.
>>>> >> >
>>>> >> > Thanks for your help,
>>>> >> > Kristen
>> 
>> 
>> -- 
>> -Michael Wayne Goodman