[developers] Increasing parse coverage
Ann Copestake
aac10 at cam.ac.uk
Sat Apr 4 23:38:31 CEST 2015
I hope people more knowledgeable than I am will comment, but:
- titles without quotes won't help - I don't know if there's any mileage
in trying to insert the quotes via a list of movie names.
- there are many very bizarre uses of ... - any idea where they came
from? It doesn't look like normal punctuation use to me. I would be
tempted to just try removing all of them ...
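
As a rough sketch of the two ideas above (stripping the ... runs and
quoting titles from a list), a preprocessing pass in Python along these
lines might be worth trying before running ACE. The function names and the
title list are made up for illustration; nothing here is part of ACE or the
ERG, and the title matching is case-sensitive and deliberately naive.

```python
import re

# Runs of two or more dots, with optional spaces between them ("...", ". . .").
ELLIPSIS = re.compile(r"\s*\.(?:\s*\.)+\s*")


def strip_ellipses(line: str) -> str:
    """Replace every run of dots with a single space and tidy whitespace."""
    return re.sub(r"\s+", " ", ELLIPSIS.sub(" ", line)).strip()


def quote_titles(line: str, titles: list[str]) -> str:
    """Wrap each known title in double quotes unless it is already quoted.

    `titles` is a hypothetical list of movie names supplied by the caller.
    Longer titles are tried first so they win over shorter overlapping ones.
    """
    for title in sorted(titles, key=len, reverse=True):
        # Skip occurrences already next to a quote or inside a longer word.
        pattern = re.compile(r'(?<!["\w])' + re.escape(title) + r'(?!["\w])')
        line = pattern.sub('"' + title + '"', line)
    return line


if __name__ == "__main__":
    sample = "I liked Spirited Away ... a lot"
    print(quote_titles(strip_ellipses(sample), ["Spirited Away"]))
```

Note that dropping the dots outright also drops any sentence boundary they
happened to mark, so it is worth checking whether coverage actually goes up.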
Ann
On 2015-04-04 17:01, Guy Emerson wrote:
> I'm trying to use the ERG to produce DMRSs for a sentiment analysis
> task. However, I'm getting relatively low coverage at the moment.
>
> I have run ACE with a freshly downloaded pre-compiled ERG as follows:
>
> ace -g erg.dat -1Tq filename
> ace -g erg.dat -1Tq -r "root_strict root_frag root_informal root_inffrag" filename
>
> In the first case, I got 64.4% coverage, and in the second, 86.4%.
>
> Are there any further tricks I could use to improve coverage? I'm
> using the Stanford Sentiment Treebank, and I've put a
> 'sentence'-segmented version of the text here:
>
> https://raw.githubusercontent.com/guyemerson/Sentimantics/master/data/sentibank.txt
>
> Many lines are noun phrases or adjective phrases. There are also a
> lot of make-it-up-as-you-go hyphenated tokens.
>
> Best,
> Guy.