[developers] Valid MRS? Bug in ERG?

goodman.m.w at gmail.com
Thu Sep 10 09:17:28 CEST 2020


Thanks for the clarification, Stephan. I've noted the suggestions to fall
back from TOP to INDEX and to allow an EDS with no top. Both make sense.
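
As a rough sketch of how I read that fall-back, something like the
following is what I have in mind. The function name and the
`representatives` mapping are hypothetical stand-ins for whatever
ed-find-representative computes in the LKB; none of this is existing
PyDelphin API.

```python
from typing import Mapping, Optional


def select_eds_top(
    top: Optional[str],
    index: Optional[str],
    representatives: Mapping[str, str],
) -> Optional[str]:
    """Pick an EDS top: try the MRS TOP first, then INDEX, else no top.

    `representatives` (hypothetical) maps an MRS handle or variable to the
    id of the EDS node that represents it, if there is one.
    """
    for key in (top, index):
        if key is None:
            continue
        node_id = representatives.get(key)
        if node_id is not None:
            return node_id
    # Neither TOP nor INDEX resolves to a node: leave the top empty.
    return None
```

This leaves out the third clause of the or(), so an INDEX that is not the
intrinsic variable of any EP yields an empty top.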

I'm completely unable to make sense of the Lisp format call, so I'm not
sure what you mean regarding conclusion (b), but I'll wait for your post to
the other thread.
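
My best attempt at decoding it, directive by directive, comes out as the
Python sketch below. I may well have misread it. The function and parameter
names are made up for illustration, and `show_status` stands in for
*eds-show-status-p*.

```python
from typing import Optional, Sequence


def eds_first_line(
    top: Optional[str],
    relations: Sequence[object],
    cyclic: bool = False,
    fragmented: bool = False,
    show_status: bool = True,
) -> str:
    """Approximate the text emitted by the quoted (format ...) call."""
    parts = ["{"]
    # ~@[~(~a~):~]  -> lowercased top plus ":" only when a top is present
    if top:
        parts.append(str(top).lower() + ":")
    # ~:[~3*~; (...)~]  -> status annotation only when it is switched on
    # and the graph is cyclic or fragmented
    if show_status and (cyclic or fragmented):
        words = [
            word
            for word, flag in (("cyclic", cyclic), ("fragmented", fragmented))
            if flag
        ]
        parts.append(" (" + " ".join(words) + ")")
    # ~@[~%~]  -> newline only when there is at least one relation
    if relations:
        parts.append("\n")
    return "".join(parts)
```

If that reading is right, the top follows the opening brace with no
whitespace in between, and an empty top leaves just the brace on the first
line.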

On Thu, Sep 10, 2020 at 2:45 PM Stephan Oepen <oe at ifi.uio.no> wrote:

> g'day:
>
> > I think the LKB's EDS code will more aggressively search for a top for
> > the EDS graph during conversion, perhaps looking to the INDEX. If anyone
> > (Stephan?) cares to explain the procedure for selecting tops in
> > less-than-perfect MRSs, I'd be happy to try and implement it in PyDelphin.
>
> yes, robustness to unusual or illformed (as in this case) MRSs has
> long been a key goal in the EDS conversion (in the LKB); MRS
> infelicities (in ERG parses) were probably more common in 2002 than
> today, but still i think that conversion should preferably never fail,
> i.e. possibly rather drop information from an illformed MRS than not
> yield an EDS at all.
>
> regarding the top node, i do indeed fall back to the INDEX, if need be:
>
>   (let* ((ltop (ed-find-representative eds (psoa-top-h psoa)))
>          (index (ed-find-representative eds (psoa-index psoa))))
>     (setf (eds-top eds)
>       (or (and (ed-p ltop) (ed-id ltop))
>           (and (ed-p index) (ed-id index))
>           (and (var-p (psoa-index psoa))
>                (var-string (psoa-index psoa))))))
>
> the third clause in the or() appears intended to deal with an MRS
> whose INDEX is not the intrinsic variable of any EP.  in that case,
> the EDS will end up with a top that is not the identifier of any of
> its nodes, so effectively no top.
>
> thinking about such corner cases just now, i am tempted to drop that
> third fall-back clause and leave the top empty (which would be
> formally equivalent, seeing as the top property is interpreted as an
> annotation on one of the actual graph nodes).  it appears native
> serialization allows for empty top nodes already, in which case there
> will be nothing following the opening brace on the first line:
>
>   (format
>    stream
>    "{~@[~(~a~):~]~
>     ~:[~3*~; (~@[cyclic~*~]~@[ ~*~]~@[fragmented~*~])~]~@[~%~]"
>    (eds-top object)
>    (and *eds-show-status-p* (or cyclicp fragmentedp) )
>    cyclicp (and cyclicp fragmentedp) fragmentedp
>    (eds-relations object))
>
> while i am sure we have never hit empty tops while working with MRSs
> produced by the ERG, the above suggests that (a) identification of the
> top node is optional in EDS and (b) native serialization was intended
> as a line-oriented format.
>
> mike, may i suggest you add the fall-back, looking for the INDEX, and
> otherwise allow EDSs whose top is empty.  regarding the exact
> definition of the native EDS serialization, i shall return to that
> question in the original thread we had on the topic (one might
> disallow whitespace between the opening brace and the optional top, to
> try and evade conclusion (b) above).
>
> cheers, oe
>


-- 
-Michael Wayne Goodman