[developers] Browsing items with lexical coverage using itsdb profiles created with pydelphin/ace

Kristen Howell kphowell at uw.edu
Mon Sep 30 19:54:39 CEST 2019


Thanks, Woodley! I'll try using art... I can't believe I forgot about that
option. I'll follow up if I still have problems.

On Mon, Sep 30, 2019 at 10:31 AM Woodley Packard <sweaglesw at sweaglesw.org>
wrote:

> I get the error field populated when I use “art” to record profiles.  Are
> you passing --tsdb-notes to ace?  It may help.
>
> Woodley
>
> On Sep 30, 2019, at 7:59 AM, Kristen Howell <kphowell at uw.edu> wrote:
>
> Thanks Mike. You're right: the error information is showing up in stderr
> rather than stdout, which is why PyDelphin isn't picking it up.
> So it sounds like I'm out of luck as far as generating profiles using ACE
> and then inspecting them with [incr tsdb()]. I will either need to use
> LKB/PET to parse, or look at the stderr from ACE to see my lexical coverage.
> Unless, Woodley, is there a command/option in ACE to send parse errors
> to stdout?
>
> On Fri, Sep 27, 2019 at 4:56 PM goodman.m.w at gmail.com <
> goodman.m.w at gmail.com> wrote:
>
>> Hi Kristen,
>>
>> The item file and the item schema in the relations file both have 15
>> fields, so I don't think there is disagreement there (although I had some
>> encoding issues with the angled quotes on a comment in the relations file;
>> I just fixed it manually).
>>
>> PyDelphin uses ACE's stdout protocols (see:
>> https://pydelphin.readthedocs.io/en/latest/api/delphin.ace.html#ace-stdout-protocols).
>> By default PyDelphin uses the --tsdb-stdout option of ACE to get as much
>> information as ACE can provide. If ACE provides the :error information,
>> PyDelphin will populate the corresponding field in a profile. From what I
>> recall, however, ACE does not output this field as consistently as the LKB
>> and PET, and sometimes it puts parsing errors on the stderr stream instead,
>> which PyDelphin does not capture.
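
A minimal sketch of why text on stderr goes missing when a wrapper reads only stdout: the two streams are separate pipes, so stderr has to be captured explicitly. The helper name run_and_capture and the stand-in child process are hypothetical; neither is part of PyDelphin or ACE.

```python
import subprocess
import sys


def run_and_capture(cmd, text):
    """Run cmd, feed it text on stdin, and return (stdout, stderr) as strings."""
    proc = subprocess.run(
        cmd,
        input=text,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.stdout, proc.stderr


# Stand-in child so the sketch runs without ACE installed: it echoes its
# input to stdout and writes a made-up diagnostic to stderr.
child = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read()); "
    "sys.stderr.write('hypothetical diagnostic\\n')",
]
out, err = run_and_capture(child, "a test sentence\n")

# With ACE on the PATH, the command might instead look like
# ["ace", "-g", "./wmb/ace/wmb.dat"] (path taken from the steps in this thread).
print("stdout:", repr(out))
print("stderr:", repr(err))
```

Reading only `out` here would silently drop the diagnostic, which is the situation described for profiles built from ACE's stdout.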
>>
>>
>> On Sat, Sep 28, 2019 at 4:53 AM Kristen Howell <kphowell at uw.edu> wrote:
>>
>>> Perhaps there is some disagreement between my item and relations files?
>>> I generated the item file using the xigt exporter. I believe this is the
>>> corresponding relation file (it's the one I point to when using the
>>> exporter). I've attached both. I am creating the profile with the following
>>> steps (in python):
>>>  ts = itsdb.TestSuite('./unprocessed/wmb/')
>>>  ace.compile('./wmb/ace/config.tdl', './wmb/ace/wmb.dat')
>>>  with ace.AceParser('./wmb/ace/wmb.dat') as cpu:
>>>      ts.process(cpu)
>>>  ts.write(path='./output/processed/wmb')
>>>
>>>
>>> On Fri, Sep 27, 2019 at 1:06 PM Stephan Oepen <oe at ifi.uio.no> wrote:
>>>
>>>> yes, the 'parse' file (like the other files in a tsdb(1) database) is
>>>> a textual encoding of a set of tuples.  what you quote looks
>>>> suspiciously spartan to me, with only the first three fields filled
>>>> and the number of 'readings' filled in.  in a regular profile, i would
>>>> expect a record of the initial and internal tokenization, various
>>>> timings, and statistics about lexical instantiation and chart
>>>> construction.  i am relatively sure that ACE does account for most of
>>>> these, so i suspect that information is getting lost somewhere in your
>>>> pipeline.
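
To check how sparse a record is, one can split it on '@' and list the fields that hold something other than a placeholder. This is only a sketch: it assumes -1 and the empty string mean "not recorded", it ignores any escaping inside string fields, and the sample record is modeled on the kind of 'parse' line quoted in this thread rather than read from a real profile. The name filled_fields is made up.

```python
UNSET = {"", "-1"}  # assumption: these values mean "not recorded"


def filled_fields(record):
    """Return the 0-based positions of fields holding a recorded value."""
    fields = record.rstrip("\n").split("@")
    return [i for i, value in enumerate(fields) if value not in UNSET]


# Sample record built programmatically: three ids, placeholders, a reading
# count of 0, a run of -1 values, and three empty trailing string fields.
sample = "@".join(
    ["0", "0", "0", "-1", "", "-1", "", "0"] + ["-1"] * 28 + ["", "", ""]
)
print(filled_fields(sample))  # → [0, 1, 2, 7]
```

Only the three identifiers and the reading count come back as filled, which matches the "suspiciously spartan" description.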
>>>>
>>>> oe
>>>>
>>>> On Fri, Sep 27, 2019 at 9:56 PM Kristen Howell <kphowell at uw.edu> wrote:
>>>> >
>>>> > Thank you Stephan. Would the 'parse' relations be the lines in the parse
>>>> file? They each look something like this:
>>>> > 0 at 0@0 at -1@@-1@@0 at -1@-1 at -1@-1 at -1@-1 at -1@-1 at -1@-1 at -1@-1 at -1@-1 at -1@-1 at -1@
>>>> -1 at -1@-1 at -1@-1 at -1@-1 at -1@-1 at -1@-1@@@
>>>> > Perhaps this means that the error field among other things is not
>>>> being populated?
>>>> > Then the question for Mike and/or Woodley would be if it is expected
>>>> to be populated.
>>>> >
>>>> > On Fri, Sep 27, 2019 at 12:33 PM Stephan Oepen <oe at ifi.uio.no> wrote:
>>>> >>
>>>> >> hi kristen,
>>>> >>
>>>> >> i had to peek at the [incr tsdb()] code myself; 'Browse Errors' will
>>>> >> extract all items where the 'error' field (in the 'parse' relation) is
>>>> >> a non-empty string.  so, if nothing comes up there, presumably there
>>>> >> either were no errors, or ACE does not populate that field?
>>>> >>
>>>> >> likewise, the pre-canned 'unproblematic' condition amounts to 'error
>>>> >> == ""', i.e. an empty string in that field.  to some degree, what to
>>>> >> consider an 'error' is arguably up to the parsing engine.  from
>>>> >> memory, i believe that both the LKB and PET will generate some
>>>> >> descriptive 'error' string for example in case of missing lexical
>>>> >> entries for some of the input tokens.
>>>> >>
>>>> >> it appears that ACE (or pyDelphin, not sure about the division of
>>>> >> labor here) maybe simply does not populate the 'error' field in the
>>>> >> profiles that it generates?
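
The two conditions described here can be sketched in a few lines of Python. This is an illustration only, not the [incr tsdb()] implementation: it assumes one '@'-separated record per line of the 'parse' file, with 'error' as the second-to-last field (followed only by 'comment'), and it ignores any escaping inside string fields. The helper names, the shortened record layout, and the sample error text are all made up.

```python
ERROR_INDEX = -2  # assumption: 'error' is second to last, before 'comment'


def error_of(record):
    """Return the 'error' field of one '@'-separated parse record."""
    return record.rstrip("\n").split("@")[ERROR_INDEX]


def browse_errors(records):
    """Keep records whose 'error' field is a non-empty string."""
    return [r for r in records if error_of(r) != ""]


def unproblematic(records):
    """Keep records whose 'error' field is the empty string."""
    return [r for r in records if error_of(r) == ""]


def make_record(i_id, error):
    """Build a short sample record; the layout is illustrative only."""
    return "@".join([str(i_id), "0", str(i_id), "0", "", error, ""])


records = [make_record(1, ""), make_record(2, "hypothetical lexical gap")]
print(len(browse_errors(records)), len(unproblematic(records)))
```

If every record has an empty 'error' field, browse_errors returns nothing and unproblematic returns everything, which is the behaviour reported at the start of the thread.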
>>>> >>
>>>> >> best wishes, oe
>>>> >>
>>>> >> On Fri, Sep 27, 2019 at 7:09 PM Kristen Howell <kphowell at uw.edu>
>>>> wrote:
>>>> >> >
>>>> >> > Hi Mike and Woodley (and others?),
>>>> >> >
>>>> >> > I've created some itsdb profiles using pydelphin and a grammar
>>>> loaded in ace. I am trying to browse the profile in [incr tsdb()]. The
>>>> results and coverage show up fine. However, when I try to browse errors,
>>>> nothing happens. Also when I try to view items with lexical coverage (using
>>>> tsdl condition--> unproblematic and then browse --> test items), I see all
>>>> of the items, not just those with lexical coverage.
>>>> >> >
>>>> >> > Is this expected to work with pydelphin profiles? If so, what
>>>> might be missing? My profile contains non-empty item, parse, result,
>>>> relations, and run files.
>>>> >> >
>>>> >> > Thanks for your help,
>>>> >> > Kristen
>>>>
>>>
>>
>> --
>> -Michael Wayne Goodman
>>
>