[logon] The highest accuracy on the JHPSTG corpus is 40.6. Is this a bug?

Bill McNeill (UW) billmcn at u.washington.edu
Fri Apr 10 23:09:47 CEST 2009


I ran a set of parse ranking experiments on the JHPSTG corpus.  I chose a
reasonable set of learning parameters and swept over the :variance and
:relative-tolerance settings.  The highest accuracy I got was 40.6.  This
seems very low, well below the state of the art.  Do I have a bug?

Here's the batch-experiment call I use:

(in-package :tsdb)

(load "parsing.lisp")

(batch-experiment
 :source "jhpstg" :skeleton "jhpstg"
 :nfold 10 :niterations 2 :type :mem
 :prefix "jhpstg"
 :score-similarities nil
 :grandparenting 4
 :active-edges-p t
 :lexicalization-p nil
 :constituent-weight 0
 :ngram-size 4 :ngram-back-off-p t
 :lm-p nil
 :random-sample-size nil
 :counts-absolute 0 :counts-contexts 0 :counts-events 0 :counts-relevant 1
 :variance '(nil 1e4 1e2 1e0 1e-2 1e-4 1e-6)
 :relative-tolerance '(1e-6 1e-8 1e-10))

I ran scoring with the following Lisp script:

(setf *tsdb-home* "/home/billmcn/logon/lingo/redwoods/tsdb/home")
(summarize-folds
 :output "/home/billmcn/temp/jhpstg.results"
 :pattern "\\[jhpstg\\]")

I got the following results:

39.449802 9.746283 13.783325 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4]
NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-6]
AT[1.0e-20] VA[1.0e-6] PC[100]'
38.287010 11.390718 16.108908 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4]
NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-6]
AT[1.0e-20] VA[1.0e-4] PC[100]'
34.571754 2.847678 4.027225 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4]
NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-6]
AT[1.0e-20] VA[1.0e+4] PC[100]'
38.173570 4.652545 6.579693 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4]
NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-10]
AT[1.0e-20] VA[1.0e-2] PC[100]'
34.571754 2.847678 4.027225 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4]
NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-8]
AT[1.0e-20] VA[1.0e+0] PC[100]'
38.287010 11.390718 16.108908 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4]
NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-10]
AT[1.0e-20] VA[] PC[100]'
34.571754 2.847678 4.027225 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4]
NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-10]
AT[1.0e-20] VA[1.0e+0] PC[100]'
34.571754 2.847678 4.027225 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4]
NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-10]
AT[1.0e-20] VA[1.0e+4] PC[100]'
40.612595 8.101849 11.457745 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4]
NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-10]
AT[1.0e-20] VA[1.0e-4] PC[100]'
34.571754 2.847678 4.027225 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4]
NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-6]
AT[1.0e-20] VA[1.0e+2] PC[100]'
38.173570 4.652545 6.579693 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4]
NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-6]
AT[1.0e-20] VA[1.0e-2] PC[100]'
34.571754 2.847678 4.027225 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4]
NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-6]
AT[1.0e-20] VA[1.0e+0] PC[100]'
34.571754 2.847678 4.027225 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4]
NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-8]
AT[1.0e-20] VA[1.0e+4] PC[100]'
34.571754 2.847678 4.027225 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4]
NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-8]
AT[1.0e-20] VA[1.0e+2] PC[100]'
38.287010 11.390718 16.108908 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4]
NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-8]
AT[1.0e-20] VA[] PC[100]'
34.571754 2.847678 4.027225 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4]
NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-10]
AT[1.0e-20] VA[1.0e+2] PC[100]'
38.287010 11.390718 16.108908 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4]
NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-10]
AT[1.0e-20] VA[1.0e-6] PC[100]'
38.173570 4.652545 6.579693 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4]
NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-8]
AT[1.0e-20] VA[1.0e-2] PC[100]'
40.612595 8.101849 11.457745 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4]
NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-8]
AT[1.0e-20] VA[1.0e-4] PC[100]'
39.449802 9.746283 13.783325 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4]
NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-8]
AT[1.0e-20] VA[1.0e-6] PC[100]'
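
To compare runs in output like the above, the wrapped records can be rejoined and ranked with a short script.  What follows is a minimal Python sketch, not part of the tsdb package: parse_results and best are hypothetical helper names.  It assumes the first of the three numbers in each record is the accuracy (consistent with the 40.6 figure quoted above) and that RT[...] and VA[...] hold the relative tolerance and the variance.  The second and third numbers are left uninterpreted.

```python
import re

# Hypothetical helpers for illustration; not part of the tsdb package.
# Assumption: a record is three floats followed by a label quoted as `...'
# and the first float is the accuracy.
RECORD = re.compile(
    r"(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s+`(.*?)'", re.DOTALL
)
FIELD = re.compile(r"\b(RT|VA)\[([^\]]*)\]")


def parse_results(text):
    """Return one dict per record, with wrapped label lines rejoined."""
    rows = []
    for match in RECORD.finditer(text):
        label = " ".join(match.group(4).split())
        fields = dict(FIELD.findall(label))
        rows.append({
            "accuracy": float(match.group(1)),
            "tolerance": fields.get("RT", ""),
            "variance": fields.get("VA", ""),  # "" corresponds to VA[]
        })
    return rows


def best(rows):
    """Return every row tied for the highest accuracy."""
    top = max(row["accuracy"] for row in rows)
    return [row for row in rows if row["accuracy"] == top]


# Two records copied from the results above.
sample = """\
39.449802 9.746283 13.783325 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4]
NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-6]
AT[1.0e-20] VA[1.0e-6] PC[100]'
40.612595 8.101849 11.457745 `/[jhpstg] GP[4] +PT -LEX CW[] +AE NS[4]
NT[type] +NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-10]
AT[1.0e-20] VA[1.0e-4] PC[100]'
"""

for row in best(parse_results(sample)):
    print(row["accuracy"], row["tolerance"], row["variance"])
```

On the two sample records this prints 40.612595 1.0e-10 1.0e-4.  Passing the whole results file to parse_results instead of sample would list every (tolerance, variance) pair tied for the top score.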


-- 
W.P. McNeill
http://staff.washington.edu/billmcn/index.shtml
Sent from Seattle, WA, United States