I have also run into empty fold and score files, accompanied by the following error in the log file:<br><br>batch-experiment(): error: `learner-rank-items(): mysterious score deficit'.<br><br>
Best,<br>Faisal<br><br><div class="gmail_quote">On Fri, Feb 27, 2009 at 7:08 PM, Bill McNeill (UW) <<a href="mailto:developers%40delph-in.net?Subject=%5Bdevelopers%5D%20What%20errors%20can%20cause%20a%20grid%20parse%20reranking%0A%09process%20to%20return%20an%20empty%20scores%20file%3F&In-Reply-To=200902271657.n1RGvDrs025375%40mv.emmtee.net" title="[developers] What errors can cause a grid parse reranking        process to return an empty scores file?">billmcn at u.washington.edu</a>> wrote:<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><pre><br>Are there any error messages in particular I should look out for in the log files? (I<br>didn't see anything obvious like "Out of memory!")<br>
<br>On Fri, Feb 27, 2009 at 8:57 AM, Stephan Oepen <<a href="http://lists.delph-in.net/mailman/listinfo/developers" target="_blank">oe at ifi.uio.no</a>> wrote:<br>
<br>><i> hi again, bill,<br></i>><i><br></i>><i> > I am running many different grid* files in parallel using the Condor<br></i>><i> > distributed computing system. On my latest round of jobs, all of my<br>
</i>><i> > Condor jobs completed, but when I look at the directories created<br></i>><i> > under ~/logon/lingo/redwoods/tsdb/home, I find that most of them have<br></i>><i> > empty score files. I take the empty score files to be a sign that<br>
</i>><i> > something didn't work.<br></i>><i><br></i>><i> yes, i would say they indicate a failed experiment. there are quite a<br></i>><i> few ways in which individual experiments can fail, without the complete<br>
</i>><i> job necessarily failing. Lisp may run out of memory at some point, but<br></i>><i> `recover' from that and carry on; for example, when reading the profile<br></i>><i> data to start an experiment, a fresh Lisp process will almost certainly<br>
</i>><i> need to grow substantially. with limited RAM and swap space, that may<br></i>><i> fail, but an `out of memory' error may be caught by the caller, where i<br></i>><i> can say for sure that [incr tsdb()] frequently catches errors, but i am<br>
</i>><i> less confident (off the top of my head) about how these will be handled<br></i>><i> in the context of feature caching and ME grid searches. our `approach'<br></i>><i> has typically been lazy: avoid errors of this kind, hence it is quite<br>
</i>><i> likely that they are not handled in a very meaningful way. experiments<br></i>><i> might end up being skipped, or even executed with incomplete data ...<br></i>><i><br></i>><i> in a similar spirit, the parameter searches call tadm and evaluate many<br>
</i>><i> times, and either one could crash (insufficient memory or disk space in<br></i>><i> `/tmp'), and again i cannot really say how that would be handled. i am<br></i>><i> afraid, my best recommendation is to (a) inspect the log files created<br>
</i>><i> by the `load' script and (b) try to create an environment for such jobs<br></i>><i> where you are pretty confident you have some remaining headroom. from<br></i>><i> my experience, i would think that means a minimum of 16 gbytes in RAM,<br>
</i>><i> generous swap space (on top of RAM), and at least several gigabytes of<br></i>><i> disk space in `/tmp'.<br></i>><i><br></i>><i> as regards your earlier (related) question about resource usage:<br>
</i>><i><br></i>><i> > What was your memory high water mark during test (as opposed to<br></i>><i> > training)?<br></i>><i><br></i>><i> memory consumption will depend on two parameters: the total number of<br>
</i>><i> results (i.e. distinct trees: `zcat result.gz | wc -l'), and how many<br></i>><i> feature templates are active (e.g. levels of grandparenting, n-grams,<br></i>><i> active edges, constituent weight). i have started to run experiments<br>
</i>><i> again myself, and i notice that we have become sloppy with memory use<br></i>><i> (the process holds on to data longer than it should need to; and the<br></i>><i> specifics of Lisp-internal memory management may be sub-optimal too).<br>
</i>><i> i am currently making changes liberally to the LOGON `trunk', where i<br></i>><i> would suggest you stick to the HandOn release version until everything<br></i>><i> has stabilized again (hopefully sometime next week, or so).<br>
</i>><i><br></i>><i> all best - oe<br></i></pre>
</blockquote></div><br>
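The checks discussed in this thread (spotting experiments with empty score or fold files, looking for error lines in the log files, and counting results the way `zcat result.gz | wc -l` does) could be scripted roughly as follows. This is only a sketch under stated assumptions: the file names `score` and `fold` (optionally gzipped), the `error:` pattern, and all function names are illustrative choices based on the messages above, not part of [incr tsdb()] or the LOGON scripts.

```python
import gzip
import re
from pathlib import Path

# Assumption: interesting log lines contain "error:", as in the
# "batch-experiment(): error: ..." message quoted in this thread.
ERROR_PATTERN = re.compile(r"error:", re.IGNORECASE)


def is_empty(path):
    """Return True if the file holds no data (gzipped files are decompressed first)."""
    if path.suffix == ".gz":
        with gzip.open(path, "rb") as handle:
            return handle.read(1) == b""
    return path.stat().st_size == 0


def find_empty_files(root, names=("score", "fold")):
    """Return sorted paths below root named like one of `names` (plain or .gz) that are empty.

    The default names are assumptions; adjust them to your profile layout.
    """
    wanted = set(names) | {name + ".gz" for name in names}
    return sorted(
        path
        for path in Path(root).rglob("*")
        if path.is_file() and path.name in wanted and is_empty(path)
    )


def find_error_lines(log_path):
    """Return (line number, text) pairs for log lines matching ERROR_PATTERN."""
    hits = []
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if ERROR_PATTERN.search(line):
                hits.append((number, line.rstrip("\n")))
    return hits


def count_results(result_gz):
    """Count newline characters in a gzipped file, like `zcat result.gz | wc -l`."""
    total = 0
    with gzip.open(result_gz, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            total += chunk.count(b"\n")
    return total
```

For example, `find_empty_files("/path/to/tsdb/home")` lists the profiles whose experiments probably failed, and `find_error_lines` can then be run on the corresponding log files to see what went wrong.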