[erg] jul-07 release of ERG

Dan Flickinger danf at csli.stanford.edu
Tue Jul 17 00:27:41 CEST 2007


Dear colleagues -

I am pleased to announce the release of the "jul-07" version of the ERG,
a minor update to the first message-free version released in March, with
some bug fixes and expanded lexical coverage for several additional 
treebanked corpora, including

  (1) FraCaS - Ann Copestake's student Richard Bergmair has collected all of
      the example sentences from FraCaS (cf. www.cogsci.ed.ac.uk/~fracas);
      the treebank for all 325 of these items is in erg/gold/fracas.
  (2) Senseval 2-4 - Tim Baldwin and his student David Martinez have provided
      an ERG-compatible tokenization for the data used in the Senseval workshops
      2, 3, and 4, and the treebanks for these three data sets are in
      erg/gold/seval2, erg/gold/seval3, and erg/gold/seval4.  Note that the
      ERG currently produces good analyses for about 80% of this data, as
      follows:

          Profile        Total      Good ERG
                         # items    analyses
          Senseval-2       242        198
          Senseval-3       327        268
          Senseval-4       135         99
                           ---        ---
          total            704        567

  (3) SciBorg - Ann Copestake and colleagues in the SciBorg project at
      Cambridge have prepared some data sets from scientific texts in
      chemistry to which the ERG is being applied, though this data cannot
      be distributed, at least not yet.

  (4) Acrolinx - Ulrich Callmeier and colleagues at the Acrolinx software
      company in Berlin have data for controlled language checking to which
      the ERG is being applied as part of an in-house R&D effort, but this 
      data also cannot be distributed.
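
For anyone who wants the per-profile rates behind the "80%" figure in item (2),
here is a minimal sketch that recomputes them from the counts in the Senseval
table.  The dictionary layout and the coverage() helper are illustrative
assumptions of this sketch, not part of the ERG or [incr tsdb()].  Note that
the three per-profile "good" counts as listed add up to 565 rather than the
567 shown in the total line; either figure rounds to 80% overall.

```python
# Counts copied from the Senseval table in item (2):
# profile name -> (total items, items with a good ERG analysis).
profiles = {
    "Senseval-2": (242, 198),
    "Senseval-3": (327, 268),
    "Senseval-4": (135, 99),
}


def coverage(total, good):
    """Return the share of items with a good analysis, as a percentage."""
    return 100.0 * good / total


# Per-profile coverage, one row per data set.
for name, (total, good) in profiles.items():
    print(f"{name:<12}{total:>6}{good:>6}{coverage(total, good):>7.1f}%")

# Overall coverage, summed from the rows rather than taken from the total line.
all_items = sum(total for total, _ in profiles.values())
all_good = sum(good for _, good in profiles.values())
print(f"{'total':<12}{all_items:>6}{all_good:>6}{coverage(all_items, all_good):>7.1f}%")
```

Summing from the rows rather than hard-coding the total line keeps the overall
rate consistent with whatever per-profile counts are entered.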

Note that the maximum entropy model released with this version has not yet 
been updated to reflect the current grammar, but should work reasonably well 
until the new model has been built and validated.

Note further that the treebanks in erg/gold will only behave properly with
[incr tsdb()] once you have an up-to-date version of the LKB, but don't rush -
that compatible version will not be in CVS for another few days yet.  I'll 
announce when it's ready.  In the meantime, you should be able to use this
ERG for everything except grammar profiling and treebanking.

I look forward to hearing of your experiences with this new release.

 Dan



