[itsdb] Fine system gymnastics

Francis Bond fcbond at gmail.com
Thu Sep 28 06:48:55 CEST 2006


G'day,

Sounds great.

> We're working on putting together a complicated test suite
> for matrix testing (including the "phenomena-extensions"). I'm
> going to briefly describe what we want to do in this message,
> in the hopes that the fine system already does some of it, and
> that I can be enlightened:
>
> 1. We have a list of "seed strings", which all use a
> generic pos tag vocabulary.
> 2. We want to create and extract gold-standard MRSs for each
> seed string.  This will require the use of different matrix-derived
> grammar starts for different strings.
> 3. We will create variations on the seed strings (i.e., by
> permuting the order of the elements), copying the MRS for each
> seed string and associating it to its variants.
> 4. We will assign grammaticality status to the string-MRS pairs,
> according to two steps:
>
>         -- global ungrammaticality
>         -- ungrammaticality relative to one of a set of test grammars
>
> Regarding step 4, it seems to me that we ought to be able to
> use either i-wf or i-comment to store codes that correspond to
> the grammaticality status, and then do custom queries that
> interpret those codes.  Does i-wf only accept 0, 1, 2 as values,
> or is it any number (or even any string)?
>
> Regarding step 2, I seem to recall that there is some way to
> store the MRSs for a test run, but I'm not turning it up on the wiki.
> Combined with treebanking to verify that we have the ones that we want
> and thinning, this ought to get it.

Please see:
http://wiki.delph-in.net/moin/ItsdbCustomization?highlight=%28semantix%29

Feel free to move it somewhere else, duplicate it or link to it if
that makes it easier to find.
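
For steps 3 and 4 in the quoted list, here is a minimal sketch of how
the variants and their status codes might be generated before loading
anything into a profile. It is plain Python, not [incr tsdb()] code:
the function names, the dictionary layout, and the three status codes
are all hypothetical, and the MRS is treated as an opaque string. In
particular, the codes are placeholders and say nothing about which
values i-wf actually accepts.

```python
from itertools import permutations

# Hypothetical status codes (placeholders, not verified i-wf values).
GRAMMATICAL = 1       # accepted by the test grammar in question
GLOBALLY_BAD = 0      # ungrammatical regardless of grammar
BAD_FOR_GRAMMAR = 2   # ungrammatical relative to one test grammar


def make_variants(seed, mrs):
    """Return one record per distinct token ordering of `seed`.

    Every variant carries a copy of the seed string's MRS (step 3).
    Duplicate orderings, which arise when a token repeats, are dropped.
    """
    seen = set()
    variants = []
    for perm in permutations(seed.split()):
        string = " ".join(perm)
        if string in seen:
            continue
        seen.add(string)
        variants.append({"string": string, "mrs": mrs})
    return variants


def judge(variant, globally_bad, grammar_accepts):
    """Assign a status code in two steps (step 4).

    `globally_bad` is a set of strings ruled out for every grammar;
    `grammar_accepts` is a caller-supplied predicate for one test grammar.
    """
    if variant["string"] in globally_bad:
        return GLOBALLY_BAD
    if grammar_accepts(variant["string"]):
        return GRAMMATICAL
    return BAD_FOR_GRAMMAR


if __name__ == "__main__":
    # "n1 iv" stands in for a seed string over a generic pos tag vocabulary.
    for v in make_variants("n1 iv", "MRS-PLACEHOLDER"):
        code = judge(v, globally_bad=set(),
                     grammar_accepts=lambda s: s == "n1 iv")
        print(code, v["string"], v["mrs"])
```

Keeping the global check ahead of the per-grammar check mirrors the
two-step order in the quoted description, so a string that is bad
everywhere never needs to be judged against an individual grammar.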

> Finally, since we're dealing with different grammars in the first
> place, we'll want some way to combine [incr tsdb()] profiles to create
> one large profile.  Has this been attempted before?

This could be the chance for Stephan to reveal the secrets of virtual
profiles...


-- 
Francis Bond  <www.kecl.ntt.co.jp/icl/mtg/members/bond/>
NTT Communication Science Laboratories | Natural Language Research Group


