[developers] Increasing parse coverage
gete2 at cam.ac.uk
Sat Apr 4 18:01:57 CEST 2015
I'm trying to use the ERG to produce DMRSs for a sentiment analysis task.
However, I'm getting relatively low coverage at the moment.
I have run ACE with a freshly downloaded pre-compiled ERG as follows:
ace -g erg.dat -1Tq filename
ace -g erg.dat -1Tq -r "root_strict root_frag root_informal root_inffrag"
In the first case, I got 64.4% coverage, and in the second, 86.4%.
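For anyone reproducing these figures, here is a minimal sketch of how a coverage percentage could be computed from a saved ACE log. It assumes ACE ends its run with a summary line of the form "NOTE: parsed X / Y sentences, ..."; that format is an assumption and is not quoted from this post. The function name coverage_from_summary is hypothetical.

```python
import re

# Assumed shape of ACE's end-of-run summary line, e.g.
# "NOTE: parsed X / Y sentences, ...". This format is an assumption.
SUMMARY_RE = re.compile(r"parsed\s+(\d+)\s*/\s*(\d+)\s+sentences")


def coverage_from_summary(log_text):
    """Return coverage in percent from the last summary line, or None.

    None is returned when no summary line is found or when the
    reported sentence total is zero.
    """
    matches = SUMMARY_RE.findall(log_text)
    if not matches:
        return None
    parsed, total = (int(n) for n in matches[-1])
    if total == 0:
        return None
    return 100.0 * parsed / total
```

To use it, capture the diagnostic output of each ACE run in a file, read the file into a string, and pass that string to coverage_from_summary to compare the two root-condition settings.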
Are there any further tricks I could use to improve coverage? I'm
using the Stanford Sentiment Treebank, and I've put a 'sentence'-segmented
version of the text here:
Many lines are noun phrases or adjective phrases. There are also a lot of
make-it-up-as-you-go hyphenated tokens.