[developers] generation bug in Agree - "are not permitted to look you."

Sat Nov 30 11:15:53 CET 2013

Perhaps we should not have two "_look_v_1_rel"'s with different
argument signatures in the first place?  I thought that this was
officially discouraged, ...

On Sat, Nov 30, 2013 at 6:37 AM, Spencer Rarrick
<spencer.rarrick at gmail.com> wrote:
> Glenn and I have identified a bug in generation in Agree that seems to
> result from a combination of unusual circumstances. When generating from a
> parse of "You are not permitted to look.", we generate a large number of
> erroneous sentences similar to "are not permitted to look you.", "'re not
> permitted to look you", etc.
>
> Steps we have identified that allow these realizations include:
>
> 1. The input mrs contains an EP with "_look_v_1_rel" which has arguments
> LBL, ARG0, ARG1.  ARG1 is coreferenced to HOOK.XARG. The "correct" LE to
> look up would be "look_v1" which has an EP with that same PRED and argument
> signature. However there is also an LE "look_v4" that has an EP with
> "_look_v_1_rel", and an additional ARG2 argument position. Despite the extra
> argument position, however, the EP successfully unifies with the EP in the
> MRS so we add this LE to the chart. LBL, ARG0, and ARG1 end up skolemized
> because there were skolems in those positions in the input MRS EP, but ARG2
> remains unskolemized because the input MRS contained no information with
> which to specialize it.
>
> 2. The chart will have an edge for "you" added because of a "pron_rel" and
> "pronoun_q_rel" with "ARG0 [ PNG [ PN 2 ] ]."  This ends up skolemized to
> the same skolem constant as HOOK.XARG and ARG1 of the "_look_v_1_rel" EP.
> Generally, this skolemization should make sure that it can only be used
> where appropriate/intended, but because ARG2 on the EP in "look_v4" is not
> skolemized, it can combine with an edge with any skolem. In fact, we see the
> edge for "you" combine with "look_v4" to form "look you", which ultimately
> appears in several root realizations.
>
> 3. If we accept fragment root symbols, we get the aforementioned VP-fragment
> root realizations such as "are not permitted to look you."  Each EP is
> accounted for, as the EP's that should have been used in the subject of the
> sentence are instead used in an argument of "look."  Strangely, we end up
> with two variables (sets of equivalence classes) that have the same skolem.
> ARG1 of "look", ARG2 of "permit", ARG2 of "parg_d_rel", and HOOK.XARG share
> the same variable which is not ARG0 of any EP, while ARG0 of pron_rel and
> pronoun_q_rel have the same skolem but are not coreferenced (except to each
> other).
>
>
> Clearly we are missing a constraint in one or more parts of our generation
> pipeline. There are fixes we have thought of, but we are not sure if they
> would have unintended consequences and possibly block some valid
> realizations in other circumstances:
>
> 1. Final subsumption check.  We are not currently performing this, and it
> should in principle rule out these realizations as there are coreferences in
> the input MRS that are not present in the MRS of these generated trees.
> However, ideally we would like to avoid generating all of these edges rather
> than simply rejecting the derivation at such a late step.
>
> 2. During PRED lookup we could require that the number and names of
> arguments in candidate LEs match exactly those in the input MRS EPs.
> In this case, we would not even add "look_v4" to the chart in the first
> place. However, could there be cases where this sort of underspecification
> in number of arguments is valid and meaningful?
>
> 3. We could generate and assign skolems for argument positions not
> constrained by the input MRS (e.g. ARG2 on "look"). This would prevent that
> edge from combining with "you" as their skolems would not unify. This seems
> potentially safer than what is suggested in (2), as it does not completely
> rule out the use of such LEs, but merely prevents them from using edges that
> should be reserved for other parts of the tree. I can imagine some control
> structure rule that might want to legitimately coreference that argument to
> some other part of the tree, and this would be disallowed by such
> skolemization, but perhaps this simply doesn't occur?
>
>
> Anyway, thanks to anyone who has made it through this incredibly long-winded
> email. If you have ideas about the correct way to fix this bug,
> suggestions would be greatly appreciated.
>
> Spencer
>
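
For anyone following along, the mechanism in steps 1 and 2 and the effect of
proposed fixes 2 and 3 can be sketched in a few lines of Python. This is a toy
model only, not Agree's code: the dictionary layout, the function names
(lookup, skolemize, can_combine), the skolem constants ("h1", "e2", "x3") and
the "_fresh" naming are all made up for illustration, and real unification
over typed feature structures is reduced here to comparing strings.

```python
from itertools import count

# Hypothetical lexicon: both entries share a PRED but differ in argument roles.
LEXICON = {
    "look_v1": {"pred": "_look_v_1_rel", "roles": ("LBL", "ARG0", "ARG1")},
    "look_v4": {"pred": "_look_v_1_rel", "roles": ("LBL", "ARG0", "ARG1", "ARG2")},
}

# Input EP from the MRS of "You are not permitted to look."
# The skolem constants are invented; "x3" stands for the variable behind "you".
INPUT_EP = {"pred": "_look_v_1_rel",
            "args": {"LBL": "h1", "ARG0": "e2", "ARG1": "x3"}}


def lookup(ep, strict_signature=False):
    """Return the names of lexical entries whose EP is compatible with `ep`.

    With strict_signature=True (fix 2), the role sets must match exactly.
    """
    hits = []
    for name, le in LEXICON.items():
        if le["pred"] != ep["pred"]:
            continue
        if not set(ep["args"]) <= set(le["roles"]):
            continue  # every role in the input must exist on the entry
        if strict_signature and set(le["roles"]) != set(ep["args"]):
            continue
        hits.append(name)
    return hits


def skolemize(le_name, ep, fresh=False):
    """Copy skolems from the input EP onto the entry's roles.

    Roles missing from the input stay None (unskolemized) unless fresh=True
    (fix 3), in which case they receive a new, unique constant.
    """
    counter = count(1)
    args = {}
    for role in LEXICON[le_name]["roles"]:
        value = ep["args"].get(role)
        if value is None and fresh:
            value = f"_fresh{next(counter)}"
        args[role] = value
    return args


def can_combine(slot, skolem):
    """An unskolemized slot (None) accepts any skolem; otherwise they must match."""
    return slot is None or slot == skolem


if __name__ == "__main__":
    print("default lookup:", lookup(INPUT_EP))
    print("strict lookup: ", lookup(INPUT_EP, strict_signature=True))
    loose = skolemize("look_v4", INPUT_EP)
    fixed = skolemize("look_v4", INPUT_EP, fresh=True)
    print("'you' fills ARG2 (current):", can_combine(loose["ARG2"], "x3"))
    print("'you' fills ARG2 (fix 3):  ", can_combine(fixed["ARG2"], "x3"))
```

In this toy version the default lookup admits both entries, the strict lookup
admits only look_v1, and the "you" edge can fill ARG2 of look_v4 only while
that slot is left unskolemized. It says nothing about whether either fix
blocks valid realizations elsewhere, which is the open question above.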

-- 
Francis Bond <http://www3.ntu.edu.sg/home/fcbond/>
Division of Linguistics and Multilingual Studies
Nanyang Technological University