[developers] Getting nil when reconstructing edges from derivations

Stephan Oepen oe at ifi.uio.no
Sun Jul 21 22:29:12 CEST 2013


thanks for working this one out, dan!  in the interest of getting the
1212 release of the ERG out of the door, i think i should just revert
[incr tsdb()] reconstruction to how it used to be, i.e. not enable the
test against the start symbol in the default configuration for now.
this way, the ERG gold treebanks can remain unchanged, and it
will be possible to export everything using the default settings in
[incr tsdb()] versions as of the time of the release.  in a few
months' time, i think, i will then change the default behavior
to actually enable the test in the out-of-the-box configuration.

best, oe


On Sat, Jul 20, 2013 at 1:48 AM, Dan Flickinger <danf at stanford.edu> wrote:
> While Stephan's hypothesis was tempting, it turns out to be wrong for the five items that Ned identified.  All five items include the expression |take X into account|, treated in the grammar as the verb |take| selecting for an NP and a detless-PP, where the collocation of verbs and detless-PPs is constrained by an idiom rule (in idioms.tdl).  Unfortunately the 1212 version has an error in the idioms.tdl file exactly for the rule that would admit |take X into account|, and since PET (sadly) never applies these transfer-rule-type idiom constraints, it happily recorded the desired derivation in the treebank, even though the LKB was unable to reconstruct it due to the idiom rule bug.
>
> I have fixed this bug in the trunk version of the grammar, but for 1212, perhaps we should just mark these five items as rejected, or even just leave things as they are in the interest of getting the 1212 release declared to be fully frozen.  Counsel, Stephan?
>
>  Dan
>
> ----- Original Message -----
> From: "Stephan Oepen" <oe at ifi.uio.no>
> To: "Ned Letcher" <nletcher at gmail.com>
> Cc: "developers" <developers at delph-in.net>
> Sent: Thursday, July 18, 2013 12:28:49 PM
> Subject: Re: [developers] Getting nil when reconstructing edges from derivations
>
> ah, these are potentially rather interesting. until a few weeks ago, (a) PET was not checking for cycles in testing against start symbols (‘sponsors’ in [incr tsdb()] jargon) and (b) [incr tsdb()] was not testing against start symbols during reconstruction. thus, it is tempting to suspect that these derivations in the gold treebanks would no longer be available (with these start symbols) from a current PET (or any other compliant DELPH-IN parser-generator). dan and i should look at these more and decide on the impact for the imminent 1212 release.
>
>
> to work around this problem, ned, i recommend you look for the right [incr tsdb()] variable near the top of ‘redwoods.lisp’ and set it to nil, to suppress testing of start symbols during reconstruction (i.e. what used to be the [incr tsdb()] default behavior until recently).
>
>
> thanks for alerting us to this issue! oe
>
>
> On Wednesday, July 17, 2013, Ned Letcher wrote:
>
>
>
> Well that makes sense about different versions of the grammar. I had just been assuming I was using the same version without really thinking about whether this was true.
>
>
> I am indeed using the up-to-date logon. There were also a couple of other problematic items from wescience once I finished processing it. Here they all are, along with the resulting diagnostic message from [incr tsdb()] after attempting interactive reconstruction.
>
>
>
> 10820520 ws13 incompatible sponsor `ROOT_STRICT'.
> 10400970 ws07 incompatible sponsor `ROOT_STRICT'.
> 10090110 ws02 incompatible sponsor `ROOT_STRICT'.
> 10052180 ws02 incompatible sponsor `ROOT_STRICT'.
> 10012800 ws01 incompatible sponsor `ROOT_INFORMAL'.
>
>
> Ned
>
> On Wed, Jul 17, 2013 at 1:08 AM, Stephan Oepen < oe at ifi.uio.no > wrote:
>
>
> hi ned,
>
>
> your code looks plausible, and i too would think all derivations in 1212 treebanks should reconstruct successfully (when using the 1212 ERG). emily rightly cautions this will typically not be the case for the ERG trunk: the treebanks are not maintained up-to-date for each incremental revision in the grammar.
>
>
> which leaves us to worry about that one WS01 item. are you sure your LOGON tree is up-to-date (1212 has not been formally released, and there were a few minor updates after tagging the initial release candidate)? if so, could you ‘Browse|Results’ on WS01, then double-click on the red 1 in the derivations column of the problematic item, then double-click on the (red) derivation. this will trigger interactive reconstruction, and assuming the derivation really is problematic, there will be a diagnostic message in the *common-lisp* buffer. what does it say?
>
>
> best, oe
>
>
>
>
>
> On Wednesday, July 17, 2013, Ned Letcher wrote:
>
>
>
> Hi all,
>
>
> I've run into some strange behaviour which I'm hoping someone can shed some light on. I've got some lisp code I'm using to reconstruct the spanning edge of the first reading for items in profiles:
>
>
>
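> ;; fetch the first record for item I-ID from PROFILE, including derivations
> (defun get-item (i-id profile)
>   (first (tsdb::analyze profile
>                         :condition (format nil "i-id == ~a" i-id)
>                         :thorough '(:derivation))))
>
> ;; reconstruct the edge for the first result of ITEM
> (defun get-item-edge (item)
>   (tsdb::reconstruct
>    (tsdb::get-field :derivation
>                     (first (tsdb::get-field :results item)))))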
>
>
> The problem is that get-item-edge is returning nil for some items even though they have derivations. For instance I get a nil result for item 10012800 in ws01 using erg 1212 but all the other parsing items from this profile are fine:
>
>
> (read-script-file-aux "~/logon/lingo/erg/lkb/script")
>
> (get-item-edge (get-item "10012800" "/gold/erg/ws01"))
>
> (get-item-edge (get-item "10012820" "/gold/erg/ws01"))
>
>
>
> I'm also getting many more of these nil results when running using trunk erg compared to 1212. When restricted to items with readings, running over ws01, I get only 1 of these nil results for 1212 and 73 for trunk erg. csli yields none with 1212 and 1 of these with trunk erg.
>
>
> Ned
>
>
>
>
> --
> nedned.net
>
> --
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> +++ Universitetet i Oslo (IFI); Boks 1080 Blindern; 0316 Oslo; (+47) 2284 0125
> +++ --- oe at ifi.uio.no ; stephan at oepen.net ; http://www.emmtee.net/oe/ ---
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
>
>
>
>
> --
> nedned.net
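
The per-item check in Ned's original message can be run over a batch of items to list which ones fail to reconstruct. The following is only a minimal sketch: it assumes [incr tsdb()] and the grammar are already loaded, and that get-item and get-item-edge are defined exactly as in that message. The name collect-nil-edges is hypothetical and not part of [incr tsdb()].

```lisp
;; Sketch only: relies on GET-ITEM and GET-ITEM-EDGE as defined in the
;; original message, with [incr tsdb()] and the grammar already loaded.
;; COLLECT-NIL-EDGES is a hypothetical helper name.
(defun collect-nil-edges (i-ids profile)
  "Return those I-IDS in PROFILE that have results but reconstruct to nil."
  (loop for i-id in i-ids
        for item = (get-item i-id profile)
        ;; skip items that are missing or have no stored results at all
        when (and item
                  (tsdb::get-field :results item)
                  (null (get-item-edge item)))
          collect i-id))

;; Example call, using the two items mentioned in the thread:
;; (collect-nil-edges '("10012800" "10012820") "/gold/erg/ws01")
```

Items returned by such a helper would still need the interactive reconstruction described above to obtain the actual diagnostic message (for example, an incompatible sponsor).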


