[developers] [erg] jul-07 release of ERG
Stephan Oepen
oe at csli.stanford.edu
Sat Aug 4 13:55:04 CEST 2007
hei!
a few days ago, i checked into the LinGO CVS the result of merging
back accumulated changes from the LOGON tree. these include the update
in the [incr tsdb()] format for storing derivations as sketched in my
17-jul message (see below). further changes in LKB code are changes
i had anticipated several weeks ago, viz. removing the long-deprecated
fill-mrs() and unfill-mrs() functionality (use a VPM instead: LogonVpm
on the DELPH-IN wiki), and eliminating the `rel' vs. `char-rel' distinction
in MRS internals (see earlier messages to the `developers' list). people
with cached generator indices will need to rebuild these files; one way
of doing that could be through a command like
  rm ~/tmp/*.ric ~/tmp/*.stc
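for anyone who would rather script this than type the shell command, the same cleanup could be sketched in Python roughly as follows. this is only an illustration: the function name remove_generator_caches is made up here, and it assumes (as the command above does) that the cached generator indices are the *.ric and *.stc files in ~/tmp.

```python
import glob
import os


def remove_generator_caches(directory, patterns=("*.ric", "*.stc")):
    """Delete files in `directory` matching the cache patterns.

    Hypothetical helper: the patterns mirror the `rm' command in the
    message; the directory to clean is supplied by the caller.
    Returns the list of paths that were removed.
    """
    removed = []
    for pattern in patterns:
        # sorted() only makes the order of removal predictable
        for path in sorted(glob.glob(os.path.join(directory, pattern))):
            if os.path.isfile(path):
                os.remove(path)
                removed.append(path)
    return removed
```

a call like remove_generator_caches(os.path.expanduser("~/tmp")) would then correspond to the command above; the indices get rebuilt afterwards as usual.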
regarding the change in UDF, i have also checked in the corresponding
code changes to PET (into the DFKI SVN repository), so i recommend
people plan on installing fresh copies of everything, once they upgrade.
besides the above changes in LKB code, there were numerous updates
to the MT component (bug fixes and improvements in transfer, better LM
support, and a minor VPM fix) and to [incr tsdb()] (mostly to do with its
MT support, treebanking, and stochastic experimentation); as most of
this functionality is only active in the full LOGON tree anyway, where it
has been in use for many months already, i will not post a summary of
changes here.
i am off for a two-week vacation, so will likely not read email before the
DELPH-IN Summit later this month.
all best, oe
On 7/17/07, Stephan Oepen <stephan.oepen at iln.uio.no> wrote:
>
> hei!
>
> here is a follow-up with some background to what dan emailed today:
>
> > I am pleased to announce the release of the "jul-07" version of the
> > ERG, a minor update to the first message-free version released in
> > March, some bug fixes and expanded lexical coverage for several
> > additional treebanked corpora, [...]
> >
> > Note further that the treebanks in erg/gold will only behave properly
> > with [incr tsdb()] once you have an up-to-date version of the LKB,
> > but don't rush - that compatible version will not be in CVS for
> > another few days yet. I'll announce when it's ready. In the
> > meantime, you should be able to use this ERG for everything except
> > grammar profiling and treebanking.
>
> regarding treebanks, the relevant changes are all in [incr tsdb()] (not
> the LKB), but as both reside in the same CVS repository, i recommend
> preparing to bring /everything/ up to date (also, dan has committed a few
> changes to LKB code, which he plans to announce separately).
>
> my treebank change is in the derivation format recorded in the profile
> database (in the `result' relation). the new, extended format (dubbed
> UDF 1.2) includes the start symbol of the grammar used to license each
> derivation, e.g.
>
> (root_strict
> (49 subjh -1.03461 0 2
> (46 proper_np -1.15336 0 1
> (45 sing_noun_irule -0.862381 0 1 (3 kim 0 0 1 ("kim" 0 1))))
> (48 punct_period_orule 0.121598 1 2
> (47 third_sg_fin_verb_orule 0.061442 1 2
> (9 sleep_v1 0 1 2 ("sleeps." 1 2))))))
>
> compared to the earlier format (UDF 1.1), the `root_strict' node at the
> top is new, thus whoever reads (or writes) derivations in [incr tsdb()]
> profiles needs to be aware of this change.
>
> as always, old code will not be able to read new profiles, but new code
> is backwards compatible with old profiles (back to around 1997). hence,
> to use [incr tsdb()] on the latest ERG profiles, you will need to move
> to the latest code base, /once/ i release it later this week. however,
> running this latest ERG, including creating your own new profiles, can
> still be done with older versions of [incr tsdb()], PET, or the LKB.
>
> to write out UDF 1.2 derivations, PET also needs to be changed. i plan
> to check in both my [incr tsdb()] and PET updates later this week.
>
> all best - oe
>
>
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> +++ Universitetet i Oslo (IFI); Boks 1080 Blindern; 0316 Oslo; (+47) 2284 0125
> +++ CSLI Stanford; Ventura Hall; Stanford, CA 94305; (+1 650) 723 0515
> +++ --- oe at ifi.uio.no; oe at csli.stanford.edu; stephan at oepen.net ---
>
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
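for code that reads derivations from the `result' relation, the difference between UDF 1.1 and UDF 1.2 described in the quoted message could be handled along the following lines. this is a minimal sketch, not the [incr tsdb()] or PET reader: the names parse_derivation and split_root are hypothetical, and the test for a start symbol (a top node whose first element is not an integer edge id) is a heuristic inferred only from the example above.

```python
import re

# tokens: parentheses, double-quoted strings, or bare atoms
TOKEN = re.compile(r'\(|\)|"(?:[^"\\]|\\.)*"|[^\s()"]+')


def parse_derivation(text):
    """Parse a parenthesized derivation string into nested lists of strings.

    Assumes balanced parentheses; malformed input raises IndexError.
    """
    tokens = TOKEN.findall(text)
    pos = 0

    def read():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while tokens[pos] != ")":
                items.append(read())
            pos += 1  # consume the closing parenthesis
            return items
        return tok

    return read()


def split_root(tree):
    """Return (start_symbol, derivation) for a parsed derivation.

    Heuristic assumption: in UDF 1.2 the top node starts with a symbol
    such as root_strict, whereas in UDF 1.1 it starts with an integer
    edge id. For UDF 1.1 input the start symbol is None.
    """
    head = tree[0]
    if isinstance(head, str) and not re.fullmatch(r"-?\d+", head):
        return head, tree[1]
    return None, tree
```

applied to the example derivation in the quoted message, split_root(parse_derivation(text)) yields "root_strict" as the start symbol together with the subtree rooted in edge 49; applied to the same derivation without the top node (UDF 1.1), the start symbol comes back as None and the tree is returned unchanged.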