[developers] question about the statistical model

Francis Bond bond at ieee.org
Thu Sep 15 06:56:31 CEST 2011


On 13 September 2011 22:12, Montserrat Marimon
<montserrat.marimon at ub.edu> wrote:
> Hi,
> I have some problems training the statistical model and I'm not sure whether
> it's because I'm not using the proper data or because I'm using a lot of
> files.
> - should I first normalize the annotated data (using trees|normalize)? in
> which case the generated file.mem seems to be rather small

The trainer needs to see the gold tree and compare it to the non-gold
trees, so you should not normalize.

> - is there a limit in the number of files (actually directories) in the
> virtual file?

I don't know.  If you are having trouble with the virtual profiles,
you should be able to concatenate everything into a single file, as
long as the IDs are unique.
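
If it helps, here is a rough sketch of how the concatenation could be
checked for clashing IDs before training.  It is not part of any
existing tool.  It assumes that each file is plain text with one record
per line, that fields are separated by "@", and that the ID is the
first field; the function names (read_ids, duplicate_ids, concatenate)
are made up for illustration, so adjust them to your data.

```python
from collections import Counter

FIELD_SEP = "@"  # assumption: fields in each record are separated by "@"


def read_ids(path, id_column=0):
    """Return the ID field of every non-empty line in one file."""
    ids = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line:
                ids.append(line.split(FIELD_SEP)[id_column])
    return ids


def duplicate_ids(paths, id_column=0):
    """Return the sorted IDs that occur more than once across all files."""
    counts = Counter()
    for path in paths:
        counts.update(read_ids(path, id_column))
    return sorted(item for item, n in counts.items() if n > 1)


def concatenate(paths, out_path, id_column=0):
    """Write all records to out_path, refusing if any ID is repeated."""
    dupes = duplicate_ids(paths, id_column)
    if dupes:
        raise ValueError("duplicate IDs: " + ", ".join(dupes))
    with open(out_path, "w", encoding="utf-8") as out:
        for path in paths:
            with open(path, encoding="utf-8") as handle:
                for line in handle:
                    if line.strip():
                        # Make sure every record ends with a newline.
                        out.write(line if line.endswith("\n") else line + "\n")
```

Calling duplicate_ids() on the list of files first shows which IDs
would need renumbering; concatenate() writes the merged file only when
that list is empty.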

Francis Bond <http://www3.ntu.edu.sg/home/fcbond/>
Division of Linguistics and Multilingual Studies
Nanyang Technological University
