[developers] reforestation of gold profiles

Francis Bond bond at ieee.org
Wed Sep 5 08:17:56 CEST 2012


I would like to reforest (the opposite of thin) some gold profiles, so
that I can do some parse ranking experiments.  My understanding is
that I:
(i) parse the test suite with the same grammar
(ii) update the new profile with the gold

I have two questions about details:

(i) which grammar (cpu) was used for the cb and sc0[123] profiles in
the up-to-date logon distribution (or how do you find out)?  Just
looking at the run file did not make things very clear.  If exactly
the same grammar is not available, what should I do?

(I am trying 'cheap' but am getting several unknown words, which
I had expected the unknown word handling to handle (at least it did in
the gold profile).  This does not bode well.)

(ii) which flags should I use when updating?  I assume automatic
update, but I am not sure if I want explicit or implicit ranks, or
result identity or equivalence?

Finally, if someone (Dan) should have a stash of unthinned profiles,
please let me know.


Francis Bond <http://www3.ntu.edu.sg/home/fcbond/>
Division of Linguistics and Multilingual Studies
Nanyang Technological University
