[developers] Increasing parse coverage

Guy Emerson gete2 at cam.ac.uk
Sat Apr 4 18:01:57 CEST 2015


I'm trying to use the ERG to produce DMRSs for a sentiment analysis task.
However, I'm getting relatively low coverage at the moment.

I have run ACE with a freshly downloaded pre-compiled ERG as follows:

ace -g erg.dat -1Tq filename
ace -g erg.dat -1Tq -r "root_strict root_frag root_informal root_inffrag" filename

In the first case, I got 64.4% coverage, and in the second, 86.4%.
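
For reference, coverage here means the share of input lines that received at least one parse. A minimal sketch of that arithmetic follows. The helper name `coverage` and the counts in the usage line are hypothetical, chosen only for illustration; they are not the actual counts from the runs above.

```shell
# Hypothetical helper: print coverage as a percentage, given the number of
# input lines that received at least one parse and the total number of lines.
coverage() {
    parsed=$1
    total=$2
    # Guard against an empty input file.
    if [ "$total" -eq 0 ]; then
        echo "no input lines" >&2
        return 1
    fi
    awk -v p="$parsed" -v t="$total" 'BEGIN { printf "%.1f%%\n", 100 * p / t }'
}

# Illustrative counts only (not taken from the real runs).
coverage 644 1000
```

How the two counts are obtained depends on how the parser output is collected, which this sketch deliberately leaves open.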

Are there any further tricks I could use to improve coverage?  I'm
using the Stanford Sentiment Treebank, and I've put a 'sentence'-segmented
version of the text here:

https://raw.githubusercontent.com/guyemerson/Sentimantics/master/data/sentibank.txt

Many lines are noun phrases or adjective phrases.  There are also a lot of
make-it-up-as-you-go hyphenated tokens.
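
To get a rough sense of how many lines the hyphenated tokens affect, something like the following could be used. It is only a sketch: the function name `count_hyphenated` is made up, and the pattern (a token with two or more internal hyphens) is an assumed approximation of "make-it-up-as-you-go" tokens, so it will miss some cases and over-count others.

```shell
# Hypothetical helper: count lines containing at least one token with
# two or more internal hyphens (e.g. "make-it-up-as-you-go").
# Ordinary single-hyphen compounds such as "well-made" are not counted.
count_hyphenated() {
    grep -cE '[[:alnum:]]+(-[[:alnum:]]+){2,}' "$1"
}

# Usage, assuming the text has been saved locally as sentibank.txt:
#   count_hyphenated sentibank.txt
```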

Best,
Guy.