Are there any error messages in particular I should look out for in the log files? (I didn't see anything obvious like "Out of memory!")

On Fri, Feb 27, 2009 at 8:57 AM, Stephan Oepen <oe@ifi.uio.no> wrote:

hi again, bill,

> I am running many different grid* files in parallel using the Condor
> distributed computing system. On my latest round of jobs, all of my
> Condor jobs completed, but when I look at the directories created
> under ~/logon/lingo/redwoods/tsdb/home, I find that most of them have
> empty score files. I take the empty score files to be a sign that
> something didn't work.

yes, i would say they indicate a failed experiment. there are quite a
few ways in which individual experiments can fail, without the complete
job necessarily failing. Lisp may run out of memory at some point, but
`recover' from that and carry on; for example, when reading the profile
data to start an experiment, a fresh Lisp process will almost certainly
need to grow substantially. with limited RAM and swap space, that may
fail, but an `out of memory' error may be caught by the caller. i can
say for sure that [incr tsdb()] frequently catches errors, but i am
less confident (off the top of my head) about how those errors will be
handled in the context of feature caching and ME grid searches. our
`approach' has typically been lazy: avoid errors of this kind in the
first place; hence it is quite likely that they are not handled in a
very meaningful way. experiments might end up being skipped, or even
executed with incomplete data ...
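
a quick way to spot failed experiments, for what it is worth, is to
scan the tsdb `home' directory for empty score files; the sketch below
(in Python, with the path and the `score' file name taken from your
description, so adjust both to your setup) is one way to do that:

  # walk the tsdb `home' directory and flag profiles whose `score'
  # file is empty (the directory and file names are assumptions taken
  # from the discussion above; adapt them to your installation).
  import os

  HOME = os.path.expanduser("~/logon/lingo/redwoods/tsdb/home")

  for root, dirs, files in os.walk(HOME):
      if "score" in files:
          path = os.path.join(root, "score")
          if os.path.getsize(path) == 0:
              print("empty score file: %s" % path)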

in a similar spirit, the parameter searches call tadm and evaluate many
times, and either one could crash (insufficient memory or disk space in
`/tmp'), and again i cannot really say how that would be handled. i am
afraid my best recommendation is to (a) inspect the log files created
by the `load' script and (b) try to create an environment for such jobs
where you are pretty confident you have some remaining headroom. from
my experience, i would think that means a minimum of 16 gbytes of RAM,
generous swap space (on top of RAM), and at least several gigabytes of
disk space in `/tmp'.
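
for (b), a crude pre-flight check along the lines of the sketch below
could be run on a node before submitting jobs; the thresholds simply
mirror the numbers above, and the /proc/meminfo parsing assumes a
Linux host:

  # rough headroom check: total RAM, swap, and free space in `/tmp'.
  import os

  info = {}
  with open("/proc/meminfo") as stream:
      for line in stream:
          key, value = line.split(":", 1)
          info[key] = int(value.split()[0])          # kbytes

  ram = info["MemTotal"] / (1024.0 ** 2)             # gbytes
  swap = info["SwapTotal"] / (1024.0 ** 2)
  stat = os.statvfs("/tmp")
  tmp = stat.f_bavail * stat.f_frsize / (1024.0 ** 3)

  print("RAM %.1f g; swap %.1f g; /tmp %.1f g free" % (ram, swap, tmp))
  if ram < 16 or swap < ram or tmp < 4:
      print("warning: this node may be short on headroom")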

as regards your earlier (related) question about resource usage:

> What was your memory high water mark during test (as opposed to
> training)?

memory consumption will depend on two parameters: the total number of
results (i.e. distinct trees: `zcat result.gz | wc -l'), and how many
feature templates are active (e.g. levels of grandparenting, n-grams,
active edges, constituent weight). i have started to run experiments
again myself, and i notice that we have become sloppy with memory use
(the process holds on to data longer than it should need to, and the
specifics of Lisp-internal memory management may be sub-optimal too).
i am currently making liberal changes to the LOGON `trunk', so i would
suggest you stick to the HandOn release version until everything has
stabilized again (hopefully sometime next week, or so).
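
to get a handle on the first of those two parameters, the result counts
for a whole directory of profiles can be tallied with a small script
like the one below (just the `zcat result.gz | wc -l' one-liner from
above, applied recursively; the top-level path is again taken from
your message):

  # count results (distinct trees) per profile, i.e. the number of
  # lines in each `result.gz'.
  import gzip
  import os

  HOME = os.path.expanduser("~/logon/lingo/redwoods/tsdb/home")

  for root, dirs, files in os.walk(HOME):
      if "result.gz" in files:
          path = os.path.join(root, "result.gz")
          with gzip.open(path, "rb") as stream:
              n = sum(1 for line in stream)
          print("%s: %d results" % (root, n))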

all best - oe

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+++ Universitetet i Oslo (IFI); Boks 1080 Blindern; 0316 Oslo; (+47) 2284 0125
+++ CSLI Stanford; Ventura Hall; Stanford, CA 94305; (+1 650) 723 0515
+++ --- oe@ifi.uio.no; oe@csli.stanford.edu; stephan@oepen.net ---
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

--
Bill McNeill
http://staff.washington.edu/billmcn/index.shtml
Sent from: Seattle Washington United States.