[developers] Problem with Logon ltriples

Tue Jul 26 12:35:23 CEST 2011

hi sergio,

> I'm trying to run Rebecca’s Elementary Dependency Match with our NLX corpus
> and I ran into a problem.
> I followed her instructions and exported the profiles in ltriples format,
> however the output does not have the links <number:number> she requires.
> Is there any easy way to obtain this, or is it something I need to change
> in the logon/grammar?

i am copying the 'developers' list, as i am guessing you work from
these instructions (by rebecca) here

  http://wiki.delph-in.net/moin/ElementaryDependencyMatch

so your question about (missing) characterization could also be of
interest for other grammars.

i must admit i am no expert on the current technical details of the
NLX grammar, but as far as i recall it used to take advantage of some
procedural 'magic' in PET to integrate information originating from
pre-processing, for example to determine CARG or PRED values on
EPs associated with unknown words, and presumably also to obtain
characterization (i.e. links to sub-strings of the original input to the
parser, in terms of character ranges).  in this setup, the derivations
recorded in [incr tsdb()] profiles do not contain complete information
about the history of each parse, hence the procedural magic that
PET applied at parse time cannot be re-played by [incr tsdb()].  as a
result, the full feature structures (and correspondingly MRSs) for
parse results cannot be rebuilt when [incr tsdb()] reconstructs the
derivations, e.g. for the purpose of exporting from a profile.  there
is some more discussion of the technical issues involved here:

  http://wiki.delph-in.net/moin/PetInput
  http://lists.delph-in.net/archive/developers/2010/001440.html
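
to make 'characterization' concrete, here is a minimal python sketch
(not part of PET or [incr tsdb()]) that resolves <from:to> links back
to sub-strings of the parser input.  the predicate name, the sample
sentence, and the helper resolve_links() are made up for illustration,
and the sketch assumes that the two numbers are character offsets with
an exclusive end position.

```python
import re

# assumed link syntax: <from:to>, two non-negative character offsets
LINK = re.compile(r"<(\d+):(\d+)>")


def resolve_links(text, labelled):
    """Return (from, to, substring) for each <from:to> link in `labelled`.

    Offsets are interpreted as positions in `text`, end-exclusive
    (an assumption of this sketch).
    """
    spans = []
    for match in LINK.finditer(labelled):
        start, end = int(match.group(1)), int(match.group(2))
        spans.append((start, end, text[start:end]))
    return spans


sentence = "the dog barks"      # hypothetical parser input
predicate = "_dog_n_1<4:7>"     # hypothetical predicate bearing a link
print(resolve_links(sentence, predicate))  # prints [(4, 7, 'dog')]
```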

EDM (in its current form) pretty much requires characterization to
be applied meaningfully, hence one would need to rework the NLX
approach to interfacing pre-processing and PET to overcome the
problem you are facing.  in general, i would still recommend that
grammarians aim to move into the chart mapping universe (and
use either the FSC or YY formats to give inputs to PET), but i do
of course understand that not everyone will have the opportunity
or time to make this transition immediately.

but in your case, maybe, not all is grim: i also recall that your
group had some post-processing magic of [incr tsdb()] profiles
to 'augment' parsing results stored there, specifically MRSs?  i
would certainly expect the [incr tsdb()] triples export to include
characterization when working from a profile that already stores
MRSs whose predicates bear characterization information.  for
the correct format, please see the MRS example on the PetInput
page linked above.  hence, assuming you can
produce an [incr tsdb()] profile that has complete MRSs, then
the triples export should have the desired effect (in this setup,
[incr tsdb()] need not reconstruct derivations and recompute
MRSs; instead, it will just use MRSs as found in the profile).
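
a quick sanity check on an export could look like the following python
sketch, which lists the non-blank lines that carry no <number:number>
link at all.  this is only an approximation under stated assumptions:
the function name lines_without_links() and the sample lines are
hypothetical, and the actual layout of the triples files may differ, so
the sketch looks at nothing beyond the presence of a link on each line.

```python
import re

# assumed link syntax: <from:to>, two non-negative integers
LINK = re.compile(r"<\d+:\d+>")


def lines_without_links(export_text):
    """Return the non-blank lines that contain no <from:to> link."""
    return [
        line
        for line in export_text.splitlines()
        if line.strip() and not LINK.search(line)
    ]


# made-up sample lines, not actual [incr tsdb()] output
sample = "x<0:3> ARG1 y<4:7>\n\nx ARG1 y\n"
missing = lines_without_links(sample)
print(len(missing), "line(s) without characterization")
```

an empty result would suggest that every exported line bears at least
one link; a long list is the symptom described in the question above.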

best wishes, oe