[developers] Increasing parse coverage
Guy Emerson
gete2 at cam.ac.uk
Sat Apr 4 18:01:57 CEST 2015
I'm trying to use the ERG to produce DMRSs for a sentiment analysis task.
However, I'm getting relatively low coverage at the moment.
I have run ACE with a freshly downloaded pre-compiled ERG as follows:
ace -g erg.dat -1Tq filename
ace -g erg.dat -1Tq -r "root_strict root_frag root_informal root_inffrag" filename
In the first case, I got 64.4% coverage, and in the second, 86.4%.
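
Coverage here is taken to be the percentage of input lines that receive at least one analysis. A minimal sketch of that calculation follows. It assumes the parser output marks each parsed input with a line starting "SENT: " and each unparsed input with a line starting "SKIP: ". Those prefixes are an assumption about the output format and may not match what the flags above actually produce, so they are left as parameters. The function name is hypothetical.

```python
def coverage(lines, parsed_prefix="SENT: ", skipped_prefix="SKIP: "):
    """Return the percentage of inputs that received a parse.

    The two prefixes are assumptions about how the parser marks
    parsed and unparsed inputs; adjust them to the real output.
    """
    parsed = 0
    skipped = 0
    for line in lines:
        if line.startswith(parsed_prefix):
            parsed += 1
        elif line.startswith(skipped_prefix):
            skipped += 1
        # Any other line (e.g. the MRS itself, blank lines) is ignored.
    total = parsed + skipped
    if total == 0:
        return 0.0
    return 100.0 * parsed / total
```

Given the saved output of a run, something like `coverage(open("run.out", encoding="utf-8"))` would give the figure for that run, where `run.out` is a placeholder file name.
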
Are there any further tricks I could use to improve coverage? I'm
using the Stanford Sentiment Treebank, and I've put a 'sentence'-segmented
version of the text here:
https://raw.githubusercontent.com/guyemerson/Sentimantics/master/data/sentibank.txt
Many lines are noun phrases or adjective phrases. There are also a lot of
make-it-up-as-you-go hyphenated tokens.
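
To give a rough idea of how common the hyphenated tokens are, a sketch along the following lines could count the lines containing them. The definition used here (a token with at least two hyphens) is only an assumption chosen for illustration, and the function names are hypothetical.

```python
import re

# Assumption: an "ad hoc" hyphenated token has at least two hyphens,
# e.g. "make-it-up-as-you-go". Ordinary compounds such as "well-made"
# have only one hyphen and are not matched.
MULTI_HYPHEN = re.compile(r"\b\w+(?:-\w+){2,}\b")


def hyphenated_tokens(line):
    """Return all tokens in the line with at least two hyphens."""
    return MULTI_HYPHEN.findall(line)


def count_lines_with_hyphenated_tokens(lines):
    """Count the lines containing at least one such token."""
    return sum(1 for line in lines if hyphenated_tokens(line))
```

Lowering the `{2,}` bound to `{1,}` would also count single-hyphen compounds.
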
Best,
Guy.