[developers] [itsdb] Howto train a model on more than one profile

Stephan Oepen oe at csli.Stanford.EDU
Mon Nov 13 18:44:31 CET 2006


hi again,

having talked to berthold by phone today, i now have a few more answers
to his questions :-).

> As a quick workaround: would it work to just start 4 lisp processes with
> 4 different configuration files, say one for each level of
> grandparenting? 

i expect that /should/ work: all temporary file names are made private
by means of the pid of the master process; likewise the context caches.
the only thing that could go wrong, i think, would be the BDB layer, in
case multiple sessions were to read the feature cache at the same time.
the current code is not smart enough to open the feature cache in
read-only mode, and i imagine BDB might reject simultaneous opens for
write access (or it might not).
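
for concreteness, a launch sequence along the following lines might do
the trick; the `alisp' executable and the per-experiment load files
`gp0.lisp' through `gp3.lisp' are purely hypothetical names, to be
adapted to the local setup:

  # start one lisp session per grandparenting level, each loading its
  # own configuration file; the jobs run in parallel in the background.
  for n in 0 1 2 3; do
    nohup alisp -L gp${n}.lisp > gp${n}.log 2>&1 &
  done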

> >  :constituent-weight '(1 2 0)
> 
> What does this feature do?

it is a highly experimental attempt at including a metric of constituent
weight in the configurational features.  so far we have been unable to
gain significantly from such features (somewhat to our surprise).

> Sure. How can I inspect the result of the experiments? Do I have to
> process the log files, or can I also inspect the tsdb profiles?

it turns out that profiles for MLE usage must have up-to-date database
schemata, i.e. have a `fold' relation and a `flags' field in `result'.
once those are there, per-fold summaries are recorded in each profile,
and something like the following can provide a quick summary:

  # loop over the per-fold profiles (directory names starting with a
  # literal `[eiche]' prefix, hence the escaped brackets in the glob)
  for i in \[eiche\]*; do
    echo "$i";
    tsdb -home="$i" -query="select f-accuracy f-extras";
  done
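
and to check beforehand which profiles still lack the newer schema, a
sketch along the following lines could grep each profile's `relations'
file for the `fold' relation (the exact test may need adapting to the
local schema files):

  # flag profiles whose schema does not declare a `fold' relation
  for i in \[eiche\]*; do
    grep -q '^fold:' "$i/relations" || echo "$i: no fold relation";
  done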

furthermore, summarize-folds() is a useful function once all profiles
are in place.

> I would also like to do an error analysis. Is that possible with the
> virtual profile setup, or will I end up creating this one large
> profile? 

the error analysis code has not been actively used in a while, but i
imagine it should still work okay.  given the lack of visibility for
virtual profiles in the [incr tsdb()] podium, one would have to call
analyze-errors() in Lisp, providing appropriate arguments.

> Another question: what happens with partially disambiguated and/or
> rejected parses. Is there a way to see how they contribute to the end
> result? Are they ignored? 

rejected inputs are ignored, as there is nothing for the conditional
model to learn from these items.  items with multiple `gold' trees are
used as such in training and treated `appropriately' in evaluation.
appropriately here means that there is no discounting for cases where
the model ranks multiple trees at the top and all of them are `gold';
conversely, not ranking all `gold' trees at the top is not penalized
either.  for example, an item with two `gold' trees counts as correct
both when just one of them ends up top-ranked and when the two tie for
the top rank.

> Finally, there are two measures of accuracy reported in the log file:
> eaccuracy and naccuracy. How do they relate to each other?

there are three, in fact:

  f-accuracy: the exact match accuracy computed by summarize-scores(),
  i.e. on the basis of scores recorded in the [incr tsdb()] database;

  n-accuracy: n-best exact match accuracy, i.e. the same as before but
  taking into account the five top-ranked candidates;
  
  e-accuracy: exact match accuracy as computed by TADM evaluate.

hence it should in principle be the case that

  e-accuracy == f-accuracy << n-accuracy

we have observed minor discrepancies (of less than one percentage
point) between the first two, and i am not losing sleep over that
(there can be more than one way of discounting).  but they should
always be in the same ballpark!

                                                           best  -  oe
              
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+++ Universitetet i Oslo (IFI); Boks 1080 Blindern; 0316 Oslo; (+47) 2284 0125
+++     CSLI Stanford; Ventura Hall; Stanford, CA 94305; (+1 650) 723 0515
+++       --- oe at csli.stanford.edu; oe at ifi.uio.no; stephan at oepen.net ---
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


