[developers] updates to LKB and [incr tsdb()] code base
Stephan Oepen
oe at csli.Stanford.EDU
Wed Feb 8 18:06:59 CET 2006
dear all,
preparing for a DELPH-IN developers gathering later this month, i just
merged a number of code changes into the LKB and [incr tsdb()] trees.
the potentially relevant LKB changes (that i remember) are:
- fixes to SPPP and orthographemics component for korean (see recent
emails on the `developers' list): deep orthographemic chains and
token-level ambiguity now work.
- substituting a *statistics* object for globals like *unifications*,
*copies*, et al. (see `lkb/src/main/statistics.lsp'). processing
that by-passes parse() or chart-generate() should now make sure to
execute reset-statistics() upon initialization.
- wrapping a function originally provided by ben into a macro, so as
to allow grammars to request the appropriate coding system, e.g.
    (when (lkb-version-after-p "2006/02/08 15:00:00")
      (set-coding-system utf-8))
i would recommend putting a form like this at the top of `script'
in each grammar, such that this central property is made explicit.
- adding the FAD library into the tree (used in [incr tsdb()]).
- moving to ACL 8.0 for builds on Linux (x86, 32- and 64-bit).
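
to illustrate the reset-statistics() point above, here is a minimal sketch of the pattern in plain Common Lisp. it is an assumption-laden stand-in, not the code in `lkb/src/main/statistics.lsp': all demo- names are hypothetical and only mimic the idea of one counter object replacing separate globals, with a custom entry point resetting it on initialization.

```lisp
;;; illustration only: stand-in definitions, NOT the actual LKB code;
;;; every demo- name below is hypothetical.
(defstruct demo-statistics
  (unifications 0)
  (copies 0))

;; a single object takes the place of one global per counter
(defvar *demo-statistics* (make-demo-statistics))

(defun demo-reset-statistics ()
  ;; zero all counters by installing a fresh object
  (setf *demo-statistics* (make-demo-statistics)))

(defun demo-process (items)
  ;; a custom entry point that does not go through parse() or
  ;; chart-generate(), so it resets the counters itself, first thing
  (demo-reset-statistics)
  (dolist (item items)
    (declare (ignorable item))
    ;; stand-in for real work: count one unification per item
    (incf (demo-statistics-unifications *demo-statistics*)))
  (demo-statistics-unifications *demo-statistics*))
```

without the reset, counts from an earlier run would carry over into the next one. in LKB proper, the corresponding step would presumably be a call to reset-statistics() at the start of any such custom entry point.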
for [incr tsdb()], the changes are more dramatic but should (hopefully)
all be backwards-compatible. all skeletons are new and now include an
extra `fold' relation (used in ML experimentation). my original, half-
baked support for MaxEnt experimentation has been replaced with a fresh
re-write, jointly authored by erik velldal and myself. the immediate
effect is that `Trees | Train' no longer works (the way it used to) and
that the external format for ME features and models has changed. also,
functions like read-mem(), mem-score-edge(), et al. are gone, but i am
assuming that no-one (but the ERG) was actually using these?
we will have more to say on these new [incr tsdb()] ML experimentation
tools in the near future, and i would like to know who in the past has
actually used the original `Train', `Rank', and `Score' functionality.
however, if you were planning to re-generate MaxEnt models in the next few
days, i would advise you to _not_ cvs update right now.
all best - oe
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+++ Universitetet i Oslo (IFI); Boks 1080 Blindern; 0316 Oslo; (+47) 2285 7989
+++ CSLI Stanford; Ventura Hall; Stanford, CA 94305; (+1 650) 723 0515
+++ --- oe at csli.stanford.edu; oe at ifi.uio.no; stephan at oepen.net ---
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++