[developers] [pet] unpacking

Tue Jul 24 00:07:02 CEST 2007

hei!

> Sorry to have contributed to the bewildering behavior.  Please try
> the new version of the grammar once I check it in.

after talking to dan more and searching my email archive (in lieu of a
functional memory), i believe we keep seeing the bug documented below.
the Oepen & Carroll (2000) algorithm and its current PET implementation
fail to prevent an edge from packing into one of its ancestors.  while
it may be the case that we tend to only hit that problem when there is
something not quite right in a grammar, in principle the configuration
i sketched below seems perfectly legitimate.  we should extend the test
for packability to reject host edges that are dominated by a new edge;
seeing that we only need to walk down the daughters chain as long as we
stay within the same start and end vertices, and that we can test edges
by identity, this should not add noticeable overhead.

looking at the code, i would suggest a test like tItem::containedp() at
the top of the for() iterating over candidate hosts in packed_edge(); i
could see myself write that, but not until late august or so.  bernd or
zhang yi, if you feel you have cycles to spare, i need not reserve this
bug fix for myself :-).

                                                       all best  -  oe

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+++ Universitetet i Oslo (IFI); Boks 1080 Blindern; 0316 Oslo; (+47) 2284 0125
+++     CSLI Stanford; Ventura Hall; Stanford, CA 94305; (+1 650) 723 0515
+++       --- oe at ifi.uio.no; oe at csli.stanford.edu; stephan at oepen.net ---
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

[--- end of forwarded message ---]

Date: Sat, 19 Mar 2005 17:55:56 -0800
From: oe at csli.stanford.edu (Stephan Oepen)
To: j.a.carroll at sussex.ac.uk
Cc: koji_tukamoto at yahoo.co.jp, dan at csli.stanford.edu
Cc: developers at delph-in.net
Subject: Re: unary rules and packing

hi again, john, and thanks for the quick reply!

i decided to copy `developers' and dan, in case there is an interesting
issue here.

> > today i managed to kill PET with an (apparently) unusual configuration
> > of packed edges, viz.
> >
> >   P [7 0-1 hoptcomp_rule (0) 1 {} {} {1}] < dtrs: 5  parents: >
> >   (7 hoptcomp 1.0000 0 1
> >     (5 hoptcomp 1.0000 0 1 {7}
> >       (1 generic_adj_compar/adj_intrans_compar_nale 0.0000 0 1 []
> >         ("firmer" 0.0000 0 1))))
> >
> > i.e. when i build # 7, using a unary rule over # 5, it gets packed into
> > its own daughter.  i believe nowhere in the published literature did we
> > discuss this case, but the consequences for unpacking are dire, viz. an
> > infinite loop (unpacking an edge requires unpacking its daughters).  i
> > cannot quite decide tonight, whether the above is a legal configuration
> > in the forest, and what its unpacking should be.  i am inclined to say
> > that our packed-edge-p() should check and block packing into daughters
> > of the current new edge.  that could be tested fairly efficiently, as i
> > would only have to walk down the daughters chain along the current span
> > (or bit-code characterizing semantic coverage), i think.
>
> Is the edge packing into its own daughter because information that has 
> been suppressed by the restrictor would have made this fail, or is it 
> simply because there's a cycle in the grammar? If the former, then 
> isn't there a problem with the restrictor? If the latter, then the 
> expected behaviour on unpacking is an infinite loop -- or if packing is 
> turned off, an infinite loop before during parsing. [...]
> 
> Or am I missing something?

for the current case, i now believe we are seeing a bug in the grammar:
its `adj_intrans_compar_nale' is not supplying a COMPS values, and then
the `hoptcomp' unary rule will cycle freely.  dan, what do you think?
after all, the generic lexical entries have hardly been used since the
YY days, for all i know.

however, i still think there is a general problem.  without assuming a
cycle in the grammar, how can we guarantee that a unary rule R applied
to some edge e_1 does not result in a structure subsumed by e_1?  the
following rule, i think, should actually do the trick:

  R := #1 & [ FOO + ] --> #1 & [ FOO - ]

assuming e_1 was unspecified for FOO, it should both feed the rule and
subsume the output structure.  or not?

                                                          best  -  oe

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+++ Universitetet i Oslo (ILF); Boks 1102 Blindern; 0317 Oslo; (+47) 2285 7989
+++     CSLI Stanford; Ventura Hall; Stanford, CA 94305; (+1 650) 723 0515
+++       --- oe at csli.stanford.edu; oe at hf.uio.no; stephan at oepen.net ---
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

[--- end of forwarded message ---]