Hi Stephan and all,<br><br>I do find the changes appropriate :-) thanks for the work. It is true that the forest creation is relatively inexpensive. However, Valia and I are still a little concerned about the potential efficiency loss on the German Grammar. Berthold, could you estimate how large the efficiency loss will be? Is an extra option necessary?
<br><br>Theoretically, this might lead to a discussion of whether selective (k-best) forest creation is necessary. But an extra option for forest creation would be an easy (though non-optimal) solution.<br><br>Another use I can think of for such an option is in coverage testing, where only the parsability of the sentence is of interest. In such cases, creating the entire parse forest does not seem necessary.
<br><br>Stephan, Berthold and Bernd, what do you think?<br><br>Best,<br>yi<br><br><div><span class="gmail_quote">On 11/9/06, <b class="gmail_sendername">Stephan Oepen</b> <<a href="mailto:oe@csli.stanford.edu">oe@csli.stanford.edu
</a>> wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">hi again,<br><br>> I also think the use of `-nsolutions' is particularly vague at the
<br>> moment. I believe this is partly due to the split of the parsing<br>> phases. To the PET developers: should the option be split for<br>> particular phases of parsing?<br><br>i had to check the code to convince myself the above was true :-). i think
<br>in packing mode, `-nsolutions' should only affect the second phase, and<br>we should always compute the full forest. i was so sure of this point<br>of view that i just checked in the code changes to make it so. here is
<br>what i put into the ChangeLog:<br><br> - ignore nsolutions limit in forest construction phase when packing<br> is on; the rationale here is that (a) forest construction is cheap<br> and (b) we need to have the full forest available for selective
<br> unpacking to compute the correct sequence of n-best results.<br><br>in fact, what i say about selective unpacking here is equally true for<br>the exhaustive unpacking mode (which should soon be deprecated, as it<br>
remains restricted to local features). while i write this, i realize<br>that forest construction may be more expensive in GG, hence my change<br>might cause berthold a loss in efficiency? a small price for greater<br>precision, i would hope! berthold, if not, i volunteer to add another
<br>switch, just as zhang yi had suggested.<br><br>while making this change, i checked in a few more minor updates, viz:<br><br> - allow selective unpacking by default when `-packing' is on, i.e. it<br> is no longer required to say `-packing=15' (but still `-nsolutions'
<br> greater than 0 is needed to actually get selective unpacking);<br> - fix an error in the YY tokenizer to make it robust to tokens coming<br> in out of surface order;<br> - complete spring cleaning of identity2() along the lines of my email
<br> of 31-oct (bernd, i could not test jxchg output, but i am optimistic<br> i did the right thing);<br> - make the MEM reader robust to various value formats in the global<br> parameter section;<br> - ditch the (deprecated) *maxent-grandparenting* parameter; its name
<br> is canonically *feature-grandparenting*, and [incr tsdb()] will use<br> that name in generating MEM files.<br><br>zhang yi and bernd, i hope you will find all of the above agreeable!<br><br> best - oe
<br><br>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<br>+++ Universitetet i Oslo (IFI); Boks 1080 Blindern; 0316 Oslo; (+47) 2284 0125<br>+++ CSLI Stanford; Ventura Hall; Stanford, CA 94305; (+1 650) 723 0515
<br>+++ --- <a href="mailto:oe@csli.stanford.edu">oe@csli.stanford.edu</a>; <a href="mailto:oe@ifi.uio.no">oe@ifi.uio.no</a>; <a href="mailto:stephan@oepen.net">stephan@oepen.net</a> ---<br>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
<br></blockquote></div><br>