[developers] updates to LKB and [incr tsdb()] code base

Wed Feb 8 18:06:59 CET 2006

dear all,

preparing for a DELPH-IN developers gathering later this month, i just
merged a number of code changes into the LKB and [incr tsdb()] trees.

the potentially relevant LKB changes (that i remember) are:

  - fixes to SPPP and orthographemics component for korean (see recent
    emails on the `developers' list): deep orthographemic chains and
    token-level ambiguity now work.

  - substituting a *statistics* object for globals like *unifications*,
    *copies*, et al. (see `lkb/src/main/statistics.lsp').  processing
    that bypasses parse() or chart-generate() should now make sure to
    execute reset-statistics() upon initialization.

  - wrapping a function originally provided by ben into a macro, so as
    to allow grammars to request the appropriate coding system, e.g.

    (when (lkb-version-after-p "2006/02/08 15:00:00")
      (set-coding-system utf-8))

    i would recommend putting a form like this at the top of `script'
    in each grammar, such that this central property is made explicit.

  - adding the FAD library into the tree (used in [incr tsdb()]).

  - moving to ACL 8.0 for builds on Linux (x86, 32- and 64-bit).

for [incr tsdb()], the changes are more dramatic but should (hopefully)
all be backwards-compatible.  all skeletons are new and now include an
extra `fold' relation (used in ML experimentation).  my original, half-
baked support for MaxEnt experimentation has been replaced with a fresh
re-write, jointly authored by erik velldal and myself.  the immediate
effect is that `Trees | Train' no longer works (the way it used to) and
that the external format for ME features and models has changed.  also,
functions like read-mem(), mem-score-edge(), et al. are gone, but i am
assuming that no-one (but the ERG) was actually using these?

we will have more to say on these new [incr tsdb()] ML experimentation
tools in the near future, and i would like to know who in the past has
actually used the original `Train', `Rank', and `Score' functionality.
however, if you were planning to re-generate MaxEnt models in the next
few days, i would advise you _not_ to cvs update right now.

                                                       all best  -  oe

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+++ Universitetet i Oslo (IFI); Boks 1080 Blindern; 0316 Oslo; (+47) 2285 7989
+++     CSLI Stanford; Ventura Hall; Stanford, CA 94305; (+1 650) 723 0515
+++       --- oe at csli.stanford.edu; oe at ifi.uio.no; stephan at oepen.net ---
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++