[developers] Training a parse ranking model for Jacy
Dan Flickinger
danf at stanford.edu
Thu Jan 12 18:46:34 CET 2012
Hi Lea -
Could it be that you are accidentally using a thinned version of the gold profile for Japanese, where only the one selected parse for each item is stored in `result'? Just a thought.
Dan
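Dan's suggestion can be checked directly on the profile: in a [incr tsdb()] profile, the `result` file stores one `@`-separated record per stored parse. The sketch below counts stored results per item, assuming the first field of each record is the parse identifier (check the profile's `relations` file to confirm the field layout for your profile):

```python
from collections import Counter

def results_per_item(lines):
    """Count stored results per item in a tsdb 'result' file.

    Assumes each line is an '@'-separated record whose first field
    is the parse identifier (hypothetical layout; verify against
    the profile's 'relations' file).
    """
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        parse_id = line.split("@", 1)[0]
        counts[parse_id] += 1
    return counts

# In a thinned profile every item has exactly one result;
# a full profile should show counts greater than 1 for ambiguous items.
sample = ["1@0@...", "2@0@...", "3@0@...", "3@1@..."]
counts = results_per_item(sample)
```

If every count comes out as 1, the profile has most likely been thinned down to the selected parses only, which would explain the "1 event" lines in the feature-caching log below.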
----- Original Message -----
From: "Lea" <frermann at coli.uni-saarland.de>
To: developers at delph-in.net
Sent: Wednesday, January 11, 2012 9:12:32 PM
Subject: [developers] Training a parse ranking model for Jacy
Hello,
I am currently training two kinds of parse ranking models for Jacy and
the ERG:
(a) on gold annotated profiles of the (Japanese-English, parallel)
Tanaka corpus
(b) on Tanaka profiles which were treebanked automatically (using MRS
alignment)
In both cases the results for Japanese are worse than expected. I train
the same models on the same data in English for the ERG, and here
everything seems to work fine. I use the 'load' script and the
'train.lisp' script, which do both feature-caching and context-caching.
setting (a)
When training on the gold-annotated Tanaka profile 006, only 1 event
per sentence is extracted for Japanese during feature caching, while
for English the number looks reasonable. The model returned for
Japanese is tiny compared to the English one and, as expected,
performs very poorly.
Japanese:
[11:44:48] operate-on-profiles(): running `pet' [30009000 - 30009200|.
[11:44:48] open-fc(): new BDB `fc.bdb'.
[11:44:48] cache-features(): item # 30009000: 1 event;
[11:44:48] cache-features(): item # 30009004: 1 event;
[11:44:48] cache-features(): item # 30009006: 1 event;
[11:44:48] cache-features(): item # 30009007: 1 event;
[11:44:48] cache-features(): item # 30009008: 1 event;
...
Events in = /tmp/.model.lfrermann.19628.events
Params out = /tmp/.model.lfrermann.19628.weights
Marginal = pseudo-likelihood
Smoothing = none
Procs = 1
Classes = 749
Contexts = 715
Features = 7 / 7
Non-zeros = 2197
English:
[11:45:59] operate-on-profiles(): running `pet' [30009000 - 30009200|.
[11:46:00] open-fc(): new BDB `fc.bdb'.
[11:46:00] cache-features(): item # 30009000: 11 events;
[11:46:00] cache-features(): item # 30009001: 6 events;
[11:46:01] cache-features(): item # 30009002: 11 events;
[11:46:01] cache-features(): item # 30009003: 11 events;
[11:46:01] cache-features(): item # 30009004: 2 events;
...
Events in = /tmp/.model.lfrermann.19803.events
Params out = /tmp/.model.lfrermann.19803.weights
Marginal = pseudo-likelihood
Smoothing = none
Procs = 1
Classes = 8739
Contexts = 1147
Features = 6595 / 8650
Non-zeros = 667364
setting (b)
When I train ranking models on one automatically treebanked profile, a
reasonable number of parses is extracted for both languages (the
output looks similar to the English output above), and the model sizes
are comparable:
Japanese:
Events in = /tmp/.model.lfrermann.20201.events
Params out = /tmp/.model.lfrermann.20201.weights
Marginal = pseudo-likelihood
Smoothing = none
Procs = 1
Classes = 4620
Contexts = 495
Features = 3432 / 4314
Non-zeros = 421357
English:
Events in = /tmp/.model.lfrermann.20140.events
Params out = /tmp/.model.lfrermann.20140.weights
Marginal = pseudo-likelihood
Smoothing = none
Procs = 1
Classes = 5067
Contexts = 617
Features = 4177 / 5234
Non-zeros = 342117
When I parse a test profile for English and Japanese using the
respective model, and compare the resulting ranks to the gold
annotations, I get 50% accuracy for English, but only 39% accuracy for
Japanese. The difference might be influenced by language-specific
differences in training and evaluation, but it still seems too big to me.
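The accuracy figure above is a top-1 exact match: the fraction of items whose model-preferred parse coincides with the gold parse. A minimal sketch of that computation (the dict representation here is hypothetical; [incr tsdb()] computes this internally):

```python
def exact_match_accuracy(model_top, gold_top):
    """Fraction of items whose top-ranked parse matches the gold parse.

    model_top, gold_top: dicts mapping item id -> identifier of the
    model-preferred / gold parse (hypothetical representation).
    Only items present in both mappings are scored.
    """
    shared = set(model_top) & set(gold_top)
    if not shared:
        return 0.0
    hits = sum(1 for i in shared if model_top[i] == gold_top[i])
    return hits / len(shared)

# E.g. two items, one agreement -> 0.5 accuracy.
acc = exact_match_accuracy({1: "a", 2: "b"}, {1: "a", 2: "c"})
```

Note that if the Japanese gold profiles were thinned as Dan suspects, a model trained on them would be handicapped at exactly this evaluation step.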
I'd be very grateful for any suggested solutions, especially given the
approaching ACL deadline on January 15.
Thank you very much for your help in advance!
Lea.