<span style="font-family:arial, sans-serif;font-size:13px;border-collapse:collapse"><div>I am trying to document some confusing variable-naming conventions in the <a href="http://wiki.delph-in.net/moin/LogonModeling">LogonModeling</a> system. If people can help me sort through them, I'll make sure they get documented clearly on the wiki.</div>
<div><br></div><div>To run parse or generation ranking experiments with the Logon system, you specify your experiment features as Lisp variables in a feature grid file that looks like this:</div><div><br></div>
<div><div><font class="Apple-style-span" face="'courier new', monospace">(in-package :tsdb)</font></div><div><font class="Apple-style-span" face="'courier new', monospace"><br></font></div><div><font class="Apple-style-span" face="'courier new', monospace">(load "parsing.lisp")</font></div>
<div><font class="Apple-style-span" face="'courier new', monospace"><br></font></div><div><font class="Apple-style-span" face="'courier new', monospace">(batch-experiment</font></div><div><font class="Apple-style-span" face="'courier new', monospace"> :source "jhpstg" :skeleton "jhpstg"</font></div>
<div><font class="Apple-style-span" face="'courier new', monospace"> :nfold 10 :niterations 2 :type :mem</font></div><div><font class="Apple-style-span" face="'courier new', monospace"> :prefix "jhpstg"</font></div>
<div><font class="Apple-style-span" face="'courier new', monospace"> :score-similarities nil</font></div><div><font class="Apple-style-span" face="'courier new', monospace"> :grandparenting '(0 2 3 4)</font></div>
<div><font class="Apple-style-span" face="'courier new', monospace"> :active-edges-p '(nil t)</font></div><div><font class="Apple-style-span" face="'courier new', monospace"> :lexicalization-p nil</font></div>
<div><font class="Apple-style-span" face="'courier new', monospace"> :constituent-weight '(1 2 0)</font></div><div><font class="Apple-style-span" face="'courier new', monospace"> :ngram-size '(0 2 3 4) :ngram-back-off-p '(nil t)</font></div>
<div><font class="Apple-style-span" face="'courier new', monospace"> :lm-p nil</font></div><div><font class="Apple-style-span" face="'courier new', monospace"> :random-sample-size nil</font></div><div><font class="Apple-style-span" face="'courier new', monospace"> :counts-absolute 0 :counts-contexts 0 :counts-events 0 :counts-relevant 1</font></div>
<div><font class="Apple-style-span" face="'courier new', monospace"> :variance '(nil 1e4 1e2 1e0 1e-2 1e-4 1e-6)</font></div><div><font class="Apple-style-span" face="'courier new', monospace"> :relative-tolerance '(1e-6 1e-8 1e-10))</font></div>
<div><br></div><div>For each combination of variable values, the Logon system produces a TSDB profile in a directory whose name looks like this:</div><div><br></div><div><font class="Apple-style-span" face="'courier new', monospace">[jhpstg] GP[3] +PT -LEX CW[] -AE NS[0] NT[] -NB LM[0] FT[:::1] RS[] MM[tao_lmvm] MI[5000] RT[1.0e-8] AT[1.0e-20] VA[1.0e-4] PC[100]</font></div>
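<div><br></div><div>As a rough sanity check on how many profiles to expect, here is a minimal Python sketch that expands the grid above. It assumes every list-valued parameter is crossed with every other one; I have not confirmed that the Logon system does exactly this (it may, for instance, skip redundant combinations). The dictionary is just my transcription of the batch-experiment arguments, with Lisp nil written as None or False.</div>

```python
from itertools import product

# List-valued parameters transcribed from the batch-experiment call above.
# Scalar settings (e.g. :lm-p nil) contribute a single value each and are
# left out because they do not change the number of combinations.
grid = {
    "grandparenting": [0, 2, 3, 4],
    "active-edges-p": [False, True],
    "constituent-weight": [1, 2, 0],
    "ngram-size": [0, 2, 3, 4],
    "ngram-back-off-p": [False, True],
    "variance": [None, 1e4, 1e2, 1e0, 1e-2, 1e-4, 1e-6],
    "relative-tolerance": [1e-6, 1e-8, 1e-10],
}


def expand(grid):
    """Yield one dict per point in the full cross-product of the grid."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


combinations = list(expand(grid))
print(len(combinations))  # 4 * 2 * 3 * 4 * 2 * 7 * 3 = 4032
```

<div>Under that assumption, this one grid file would account for 4032 profile directories, which is part of why a reliable mapping between parameters and directory names matters.</div>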
<div><br></div><div>Most of the variables in the Lisp file map to items in the profile directory name and vice versa, but the mappings can be obscure. The functions that perform the mappings appear to be feature-environment, mem-environment, and svm-environment in learner.lisp, which unfortunately are not documented.</div>
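<div><br></div><div>To line directory names up against the grid file, it helps to split a name into its individual fields first. The following Python sketch does only that. The function name parse_profile_name and the three token shapes it recognizes (a bracketed prefix, KEY[value], and +KEY or -KEY) are my assumptions based on the single example above, not something taken from learner.lisp, and it makes no attempt to say which Lisp parameter each code stands for.</div>

```python
import re

# Token shapes inferred from the one example directory name; hypothetical,
# not taken from the Logon source.
KEYED = re.compile(r"([A-Z]+)\[(.*)\]")  # e.g. GP[3], CW[], FT[:::1]
FLAG = re.compile(r"([+-])([A-Z]+)")     # e.g. +PT, -LEX


def parse_profile_name(name):
    """Split a profile directory name into a dict of raw fields.

    Keyed values are kept as strings; +/- flags become booleans.
    Assumes fields are separated by whitespace and contain none themselves.
    """
    fields = {}
    for token in name.split():
        keyed = KEYED.fullmatch(token)
        flag = FLAG.fullmatch(token)
        if keyed:
            fields[keyed.group(1)] = keyed.group(2)
        elif flag:
            fields[flag.group(2)] = flag.group(1) == "+"
        elif token.startswith("[") and token.endswith("]"):
            fields["prefix"] = token[1:-1]
        else:
            raise ValueError(f"unrecognized token: {token!r}")
    return fields


example = (
    "[jhpstg] GP[3] +PT -LEX CW[] -AE NS[0] NT[] -NB LM[0] FT[:::1] RS[] "
    "MM[tao_lmvm] MI[5000] RT[1.0e-8] AT[1.0e-20] VA[1.0e-4] PC[100]"
)
fields = parse_profile_name(example)
for key, value in fields.items():
    print(key, repr(value))
```

<div>Keeping the values as raw strings is deliberate: empty brackets such as CW[] and the colon-separated FT[:::1] stay visible exactly as they appear, so nothing is guessed about what they mean.</div>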
<div><br></div><div>By working backwards from the sample feature grid lisp files and reading the source code, I've put together this table of correspondences: <a href="http://spreadsheets.google.com/pub?key=t7uBUaLi1Y6w5wYWmF_YrDg&output=html">http://spreadsheets.google.com/pub?key=t7uBUaLi1Y6w5wYWmF_YrDg&output=html</a>. Can people help me fill it out completely?</div>
<div><br></div><div>This spreadsheet uses the full directory names, not the compact directory names. It also doesn't list the SVM parameters. The Source column indicates where the experimenter specifies the parameter value. In this column, "grid file" means that the parameter is specified with a Lisp variable in the feature grid file, under the name that appears in the Lisp Parameter column.</div>
<div><br></div><div>Specific questions:</div><div><ol><li>Can you specify the feature parameters use-preterminal-types-p and ngram-tag in the feature grid file? (From the source it would appear so, but they don't show up in the sample feature grid Lisp files.)</li>
<li>How do you specify absolute-tolerance and redwoods-train-percentage? (The source code lists these as mem-environment parameters, and it's unclear whether they are specified differently from the feature parameters.)</li><li>
What is the deal with the FT[:::] field in the profile directory names? I find it completely cryptic, and the source code seems to indicate that it contains the lm-p value twice.</li></ol></div><div>Thanks.</div><div><span class="Apple-style-span" style="font-size: small;"><br>
</span></div></div></span><div>-- <br>W.P. McNeill<br><a href="http://staff.washington.edu/billmcn/index.shtml" target="_blank">http://staff.washington.edu/billmcn/index.shtml</a>
</div>