[developers] SOLVED: Re: Mysterious generation problem in LOGON

Fri Feb 8 23:27:14 CET 2008

hi berthold,

> today I discovered the reason for the strange behaviour I reported a few
> days ago. 
> 
> 1. The reason for the difference in behaviour between the source and
> binary version of LOGON was a configuration issue on my side. 
> :logon was not pushed as a feature, and a look at the code revealed that
> score-strings indeed behaves differently depending on this feature. 

i feel like clarifying the relationship between the standard versions
of the LKB and [incr tsdb()], and the LOGON tree: the LOGON tree is a 
collection of DELPH-IN resources, partly pre-compiled for Linux, that
has been tested for interoperability.  typically, the source in both
versions is identical, with two exceptions.  first, (a) within a LOGON
development cycle some external code changes may not have been picked
up yet, or internal changes may not have been propagated to the
mainline yet.  typically, each cycle is a period of three to six months
(and for significant changes, i have at times synchronized the two
universes within such a cycle).  second, (b) the LOGON tree adds some
new functionality that is not supported in the bare versions of the LKB
and [incr tsdb()].  for example, this includes the support for MRS or
realization ranking using LMs, and the code that allows experimentation
with MaxEnt or SVM training and evaluation.

parts of this functionality are conditioned on the compile-time feature
:logon, as you discovered.  for those few people using a LOGON tree and
`bare' LinGO tree side-by-side, it is vital to make sure the feature is
present in the LOGON tree, and /not/ present otherwise.
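
to check which universe a running lisp image belongs to, one can look
at *features* directly.  the following is only a minimal sketch in plain
Common Lisp; `logon-feature-p' and `describe-universe' are hypothetical
helper names, not part of the LKB or [incr tsdb()].

```lisp
;; hypothetical helper, for illustration only: is :logon on *features*?
(defun logon-feature-p ()
  (and (member :logon *features*) t))

;; hypothetical helper: reports which branch the reader selected when
;; this definition was read; the choice is frozen at read time
(defun describe-universe ()
  #+:logon "LOGON tree"
  #-:logon "bare LKB / [incr tsdb()]")
```

note that #+:logon is resolved by the lisp reader, so the feature has to
be present (e.g. via (pushnew :logon *features*)) /before/ the affected
files are compiled or loaded; pushing it afterwards does not change code
that has already been read.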

> 2. The difference between the LOGON and the standard version of LKB
> appears to be real: when confronted with an unspecified PRED value (in
> C-CONTS), standard LKB still happily generates (the value got specified
> by some other rule) as expected, whereas LOGON does not. I suspect this
> is due to a difference in indexing, but - not being an LKB developer - I
> can only speculate here. 
> Introducing a more specific PRED value in the wh-h rule solved the
> problem for LOGON as well. 

this one is interesting.  you actually found one of the few differences
in the code:

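  (defun extract-pred-from-rel-fs (rel-fs &key rawp)
    (let* ((label-list (fs-arcs rel-fs))
           (pred (rest (assoc (car *rel-name-path*) label-list)))
           (pred-type (if pred (fs-type pred))))
      ;; with rawp, return the PRED value itself rather than its type
      (when rawp (return-from extract-pred-from-rel-fs pred))
      (if (and pred-type
               (not (is-top-type pred-type))
               ;; LOGON tree only: also reject the top semantics type
               #+:logon
               (not (is-top-semantics-type pred-type)))
        pred-type
        ;; no usable PRED: fall back to the type of the relation itself,
        ;; but only when *rel-name-path* is unset
        (unless *rel-name-path* (fs-type rel-fs)))))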

`predsort' in your example is exactly the top semantics type, and the
additional test above means that, in the LOGON tree, [ PRED predsort ]
results in an MRS with an empty PRED value.  this is necessary for the
creation of some transfer rules, e.g. where an OUTPUT PRED is provided
by a variable.  but it also means that, in the LOGON universe, a rule
or lexical entry with a maximally unspecific PRED will not be indexed,
where based on subsumption of predicates it should be indexed as being
activated by /any/ possible PRED value.
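
the effect on indexing can be illustrated with a toy model.  this is
only a sketch under loud assumptions: `toy-extract-pred',
`toy-rule-active-p', and the predicate name `_dog_n_1_rel' are made up
for illustration, the `logonp' keyword stands in for the compile-time
feature, and the real generator index works on feature structures and
the type hierarchy rather than on plain symbols.

```lisp
;; toy stand-in for the top semantics type (assumption: `predsort')
(defparameter *toy-top-semantics-type* 'predsort)

;; mimics the final test in extract-pred-from-rel-fs: with logonp true,
;; the top semantics type is discarded, i.e. yields an empty PRED
(defun toy-extract-pred (pred-type &key logonp)
  (if (and pred-type
           (not (and logonp (eq pred-type *toy-top-semantics-type*))))
      pred-type
      nil))

;; toy subsumption-based lookup: a rule whose PRED is the top semantics
;; type is compatible with /any/ input PRED, whereas a rule with an
;; empty PRED is never found through the index
(defun toy-rule-active-p (rule-pred input-pred)
  (cond ((null rule-pred) nil)
        ((eq rule-pred *toy-top-semantics-type*) t)
        (t (eq rule-pred input-pred))))

;; bare universe: the rule is activated by an arbitrary input PRED
(toy-rule-active-p (toy-extract-pred 'predsort) '_dog_n_1_rel)

;; LOGON universe: the PRED is emptied, so the rule is never activated
(toy-rule-active-p (toy-extract-pred 'predsort :logonp t) '_dog_n_1_rel)
```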

apparently, at the time i assumed that every grammar would specify more
specific PREDs on all rules and lexical entries, but now i think the
LOGON behavior is formally wrong.

it still has to be the case that in the description of /transfer rules/
[ PRED predsort ] is treated specially, but i should look for a way of
making that true without affecting other processes that read off an MRS
from a feature structure.

in conclusion, i would call your first problem a pilot error, while the
second one uncovered a bug in the LOGON code.

> PS: Could there not be a warning message, whenever PRED predsort is
> detected?

even if we judge [ PRED predsort ] a legal value, i suspect it will
rarely if ever be intended in a rule or lexical entry, so this makes a
feature request then :-).  i suspect that rules indexed on `predsort'
can make generation quite slow, as there will be one copy of each such
rule /per/ EP in the input MRS.  hence, such a warning would seem like
a good idea to me.  ann or john, any comments from your end?
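
such a check could look roughly like the following sketch.
`warn-on-top-pred' is a hypothetical name, and the rule identifier and
pred type are passed in as plain symbols here; in the LKB the test would
presumably use is-top-semantics-type rather than eq, and where exactly
it should be called is left open.

```lisp
;; hypothetical sketch: signal a warning when a rule or lexical entry
;; carries the top semantics type as its PRED; returns t if it warned,
;; nil otherwise
(defun warn-on-top-pred (id pred-type &key (top 'predsort))
  (when (eq pred-type top)
    (warn "~a: PRED is the top semantics type `~(~a~)'; ~
           it will be compatible with every EP in the input MRS."
          id pred-type)
    t))

;; example call with a made-up rule identifier
(warn-on-top-pred 'wh-h 'predsort)
```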

                                                      all best  -  oe

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+++ Universitetet i Oslo (IFI); Boks 1080 Blindern; 0316 Oslo; (+47) 2284 0125
+++     CSLI Stanford; Ventura Hall; Stanford, CA 94305; (+1 650) 723 0515
+++       --- oe at ifi.uio.no; oe at csli.stanford.edu; stephan at oepen.net ---
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++