[itsdb] Fine system gymnastics

Emily M. Bender ebender at u.washington.edu
Thu Sep 28 06:26:25 CEST 2006


Hi,

We're working on putting together a complicated test suite
for matrix testing (including the "phenomena-extensions"). I'm
going to briefly describe what we want to do in this message,
in the hopes that the fine system already does some of it, and
that I can be enlightened:

1. We have a list of "seed strings", which all use a
generic POS-tag vocabulary.
2. We want to create and extract gold-standard MRSs for each
seed string.  This will require the use of different matrix-derived
grammar starts for different strings.
3. We will create variations on the seed strings (i.e., by
permuting the order of the elements), copying the MRS for each
seed string and associating it with its variants.
4. We will assign grammaticality status to the string-MRS pairs,
in two steps:

	-- global ungrammaticality
	-- ungrammaticality relative to one of a set of test grammars
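
For concreteness, step 3 could be sketched roughly as follows in Python.
This is only an illustration of the idea: `make_variants` and the
placeholder MRS string are hypothetical names, not part of [incr tsdb()],
and the MRS is treated as an opaque value that is simply copied.

```python
from itertools import permutations


def make_variants(seed, mrs):
    """Pair every distinct word order of a seed string with the seed's MRS.

    `seed` is a whitespace-separated string of generic POS-tag tokens;
    `mrs` is treated as an opaque value and copied unchanged.
    """
    tokens = seed.split()
    seen = set()
    variants = []
    for perm in permutations(tokens):
        string = " ".join(perm)
        # Repeated tokens yield duplicate orders; keep each string once.
        if string not in seen:
            seen.add(string)
            variants.append((string, mrs))
    return variants


# Hypothetical seed string and placeholder MRS, for illustration only.
pairs = make_variants("n1 tv n2", "<MRS for 'n1 tv n2'>")
```

The number of variants grows factorially with the length of the seed
string, so longer seeds would probably need a restricted set of
permutations rather than all of them.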

Regarding step 4, it seems to me that we ought to be able to
use either i-wf or i-comment to store codes that correspond to
the grammaticality status, and then do custom queries that
interpret those codes.  Does i-wf accept only 0, 1, and 2 as values,
or any number (or even any string)?
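
To make the coding idea concrete, here is a rough sketch in Python of
the kind of code I have in mind, assuming the field turns out to accept
arbitrary strings (which is exactly what I don't know).  The code format
and the function names `encode_status` and `is_grammatical` are made up
for illustration, as are the grammar names.

```python
def encode_status(globally_ok, rejected_by=()):
    """Pack the two-step grammaticality status into one string code.

    "*"        -> globally ungrammatical
    "ok"       -> grammatical for every test grammar
    "bad:a,b"  -> ungrammatical only for test grammars a and b
    """
    if not globally_ok:
        return "*"
    if not rejected_by:
        return "ok"
    return "bad:" + ",".join(sorted(rejected_by))


def is_grammatical(code, grammar):
    """Interpret a stored code relative to one test grammar."""
    if code == "*":
        return False
    if code == "ok":
        return True
    return grammar not in code[len("bad:"):].split(",")


# Hypothetical grammar names, for illustration only.
code = encode_status(True, rejected_by={"svo", "sov"})
```

A custom query would then only need to decode this one field to decide
whether an item counts as grammatical for the grammar under test.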

Regarding step 2, I seem to recall that there is some way to
store the MRSs for a test run, but I'm not turning it up on the wiki.
Combined with treebanking (to verify that we have the ones that we want)
and thinning, this ought to do it.

Finally, since we're dealing with different grammars in the first
place, we'll want some way to combine [incr tsdb()] profiles to create
one large profile.  Has this been attempted before?

Thanks,
Emily



