<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 TRANSITIONAL//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; CHARSET=UTF-8">
<META NAME="GENERATOR" CONTENT="GtkHTML/3.10.3">
</HEAD>
<BODY>
On Fri, 2006-11-10 at 12:06 +0100, Stephan Oepen wrote:
<BLOCKQUOTE TYPE=CITE>
<PRE>
hi berthold,

> it is really a large number of experiments. I started my first
> experiment about a week ago, and I am not even into grandparenting yet.
> Is there a way to speed things up, e.g. by dropping some less
> interesting variation in parameters? Or is there any support for
> multiprocessing?
>
> Some parameters are not really self-explanatory. Can you provide some
> comments on grid.lisp? Which parameters are now supported in Pet?

looking at your log file, all seems to proceed as it should :-).  most
of the time goes into parameter estimation, and there is little we can
do about that (short of parallelizing experiments, which i would like
to implement one day).
</PRE>
</BLOCKQUOTE>
As a quick workaround: would it work to just start 4 lisp processes with 4 different configuration files, say one for each level of grandparenting?
<BLOCKQUOTE TYPE=CITE>
<PRE>
for each experiment, you get 18 grids:

  :variance '(nil 1e4 1e2 1e-2 1e-4 1e-6)
  :relative-tolerance '(1e-6 1e-8 1e-10))

one grid takes between five minutes and one hour (for your 15,000 Eiche
items), and by default each grid comprises two folds.  those hour-long
runs appear to be ones with either no prior (`variance') or a very low
relative tolerance; they often diverge.
</PRE>
</BLOCKQUOTE>
That's really useful to know.
<BLOCKQUOTE TYPE=CITE>
<PRE>
maybe you could trim down the TADM parameter variation, e.g.

  :variance '(1e4 1e2 1e-2 1e-4 1e-6)
  :relative-tolerance '(1e-6 1e-8))

assuming the default LOGON `grid.lisp', you should get 192 experiments

  :grandparenting '(0 2 3 4)
  :active-edges-p '(nil t)
  :lexicalization-p nil
  :constituent-weight '(1 2 0)
</PRE>
</BLOCKQUOTE>
What does this feature do?
<BLOCKQUOTE TYPE=CITE>
<PRE>
  :ngram-size '(0 2 3 4)
  :ngram-back-off-p '(nil t)

such that you are more than ten per cent done already :-).  so maybe my
defaults are overly generous with cpu days!

if your main interest is a model to use with PET, you can cut out all
variation but grandparenting and active edges (aka partial
configurations).  so maybe the following

  :grandparenting '(0 2 3 4)
  :active-edges-p '(nil t)
  :lexicalization-p nil
  :constituent-weight 0
  :ngram-size 0
  :ngram-back-off-p nil

this would bring down the total to eight experiments, each of ten grids
... you will be done in no time!

when talking to zhang yi recently, we (think we) worked out what would
be needed for PET to also support those n-gram features (with selective
unpacking that is; i personally believe it is not really worth adapting
the non-selective universe for additional features).  but before making
the time to implement such extensions, we should know how much we gain
on top of the basic configurational features plus grandparenting.  from
past experience, that could be relatively little.  to know for sure, we
would have to complete more of those experiments in the above ... but it
might still make sense to narrow down estimation parameters first.
</PRE>
</BLOCKQUOTE>
Sure. How can I inspect the results of the experiments? Do I have to process the log files, or can I also inspect the tsdb profiles?
I would also like to do an error analysis. Is that possible with the virtual profile setup, or will I end up creating this one large profile?
Another question: what happens with partially disambiguated and/or rejected parses? Is there a way to see how they contribute to the end result? Are they ignored?
Finally, there are two measures of accuracy reported in the log file: eaccuracy and naccuracy. How do they relate to each other?
B
<BLOCKQUOTE TYPE=CITE>
<PRE>
i hope this helps!
cheers - oe

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+++ Universitetet i Oslo (IFI); Boks 1080 Blindern; 0316 Oslo; (+47) 2284 0125
+++ CSLI Stanford; Ventura Hall; Stanford, CA 94305; (+1 650) 723 0515
+++ --- <A HREF="mailto:oe@csli.stanford.edu">oe@csli.stanford.edu</A>; <A HREF="mailto:oe@ifi.uio.no">oe@ifi.uio.no</A>; <A HREF="mailto:stephan@oepen.net">stephan@oepen.net</A> ---
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
</PRE>
</BLOCKQUOTE>
</BODY>
</HTML>