[developers] pyDelphin / [incr tsdb()] question

Stephan Oepen oe at ifi.uio.no
Fri Apr 12 02:09:06 CEST 2019


my apologies for the opaque inside joke, alexandre!  some of us used to
work at a start-up (called YY Technologies) a few years ago, developing the
premier email auto response solution at the time.  hence support for mbox
files was relevant at least back then :-).  and the ERG treebanks still
include four profiles with e-commerce emails.

regarding interpretation of the i-length field, i would maybe argue that
the two uses ultimately have the same meaning: the field quantifies the
length of linguistic content; when there is nothing worth parsing in an
item, its ‘linguistic length’ is zero (or maybe -1, not quite sure just
now).  but, yes, this results in a flag-like behavior for the processing
commands: they skip over (what you might call ‘noise’) items which lack
linguistic content.
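
a minimal sketch of that reading of i-length, in plain python rather than
[incr tsdb()] or pyDelphin code: only the field names i-id, i-input, and
i-length come from this thread; the sample items, their values, and the
helper names are hypothetical, and since it is not certain whether empty
items carry 0 or -1, the check simply requires a positive length.

```python
# Hypothetical in-memory items. Only the field names (i-id, i-input, i-length)
# come from the thread; the values below are invented for illustration.
items = [
    {"i-id": 1, "i-input": "From: someone@example.com", "i-length": -1},
    {"i-id": 2, "i-input": "Subject: order status", "i-length": 0},
    {"i-id": 3, "i-input": "Where is my order?", "i-length": 4},
]


def has_linguistic_content(item):
    """Return True when i-length is positive, i.e. the item is worth parsing."""
    return item["i-length"] > 0


def select_for_processing(items):
    """Keep only the items a 'Process' style command would send to the parser."""
    return [item for item in items if has_linguistic_content(item)]


selected = select_for_processing(items)
print([item["i-id"] for item in selected])
```

treating both 0 and -1 as ‘no linguistic content’ keeps the sketch
independent of which of the two values the import facilities actually write.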

best, oe


On Thu, 11 Apr 2019 at 19:31 Alexandre Rademaker <arademaker at gmail.com>
wrote:

> Hi Stephan,
>
> > On 11 Apr 2019, at 13:52, Stephan Oepen <oe at ifi.uio.no> wrote:
> >
> > hi emily and mike,
> >
> > the [incr tsdb()] import facilities support some mixed-content document
> formats, notably the un*x mbox format (you can guess when that was useful
> functionality).
>
> Sorry, I don’t! Do you mean that mbox files can be imported into profiles
> directly? So far, I always thought that a profile’s items are all sentences
> or phrases to be analysed by a grammar.
>
> >  to represent all data in the profile while not pretending that there is
> linguistic content (worth sending to the parser) in email headers, the
> corresponding items are marked as i-length = -1 (or maybe 0, not quite
> sure).  this is the reason for the ‘Process | ...’ commands to require a
> non-zero, positive length ... in other words a reassurance that there
> actually is linguistic content in the item.  in this regard, i-length (like
> i-id, i-input, and possibly i-wf as well) is a mandatory field in the item
> relation.
>
> So i-length has two meanings: it is at the same time the length of the
> input (in tokens) and a flag. The -1 has a special meaning, right?
>
> That is, what you are saying is that a profile can also accommodate noise
> data and we can explicitly use i-length to mark which items are relevant
> for processing. Is that right?
>
> Best,
> Alexandre
>
>
>