[developers] predicate naming in MRS
Michael Wayne Goodman
goodmami at u.washington.edu
Mon Dec 28 21:53:50 CET 2015
Hi Stephan, developers,
As you've noted, we've discussed this in previous summits (
email threads (e.g.,
http://lists.delph-in.net/archives/developers/2007/000804.html). It would
be nice to make our different codebases more consistent.
From the pyDelphin side, I try, as much as possible, to retain the original
form of preds, but comparisons normalize the values. Thus, if pyDelphin
reads in "_Cat_n_1_rel" it will output "_Cat_n_1_rel", but a comparison
with "_cat_n_1_rel" should return True (see below). This is because
pyDelphin relies on external tools like the LKB and ACE, and therefore it
tries hard to speak their respective MRS dialects. For example, StringPreds
(both single- and double-quoted) and RealPreds are equivalent:
>>> from delphin.mrs import Pred
>>> Pred.stringpred('_cat_n_1_rel') == Pred.realpred(lemma='cat', pos='n',
>>> Pred.stringpred('"_cat_n_1_rel"') == Pred.stringpred('_cat_n_1_rel')
>>> Pred.stringpred('"_cat_n_1_rel"') == Pred.stringpred('\'_cat_n_1_rel')
>>> Pred.stringpred('_cat_n_1_rel') == '"_cat_n_1_rel"'  # compare to regular strings, for convenience
However, Preds in the current version of pyDelphin are case-sensitive:
>>> Pred.stringpred('_cat_n_1_rel') == Pred.stringpred('_Cat_n_1_rel')
I've filed a bug for this (https://github.com/delph-in/pydelphin/issues/45)
and implemented a fix. I'll hold off checking in the fix until we've
reached consensus in this thread.
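
For illustration only, the intended behavior (keep the original form for
output, normalize for comparison) might be sketched roughly as below. This
is not pyDelphin's actual code and not the fix mentioned above; the class
name PredSketch and its normalize method are hypothetical, and the quote
handling is an assumption based on the examples in this message.

```python
class PredSketch:
    """Hypothetical stand-in for a predicate; NOT pyDelphin's Pred class."""

    def __init__(self, string):
        # The original form is stored verbatim so it can be written back out.
        self.string = string

    @staticmethod
    def normalize(string):
        # Assumption: drop surrounding double quotes or a leading
        # Lisp-style single quote, then fold case, so that comparisons
        # ignore both the type/string and the case distinction.
        return string.strip('"').lstrip("'").lower()

    def __eq__(self, other):
        # Allow comparison with plain strings, for convenience.
        if isinstance(other, PredSketch):
            other = other.string
        if not isinstance(other, str):
            return NotImplemented
        return self.normalize(self.string) == self.normalize(other)

    def __hash__(self):
        # Hash on the normalized form so equal preds hash alike.
        return hash(self.normalize(self.string))

    def __str__(self):
        return self.string


p = PredSketch('_Cat_n_1_rel')
print(str(p))                              # _Cat_n_1_rel
print(p == PredSketch('"_cat_n_1_rel"'))   # True
```

Hashing on the normalized form keeps such objects usable as dictionary
keys while still treating case and quoting variants as the same predicate.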
(Also, in investigating this, I found that pyDelphin was not robustly
dealing with other case differences:
Perhaps you can confirm the following:
* Predicates (string or otherwise) are always case-insensitive
* Other strings (CARG and surface values) are case-sensitive
* Everything else in SimpleMRS (rargnames, variables, variable property
names and values, HCONS/ICONS relations, things like "LTOP" and "RELS",
etc.) is case-insensitive
* The XML format of MRS may have case-sensitive things (tag names, etc.),
following the rules for XML
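
If those points hold, the comparison rules for SimpleMRS could be sketched
as follows. This is only a sketch under the assumptions listed above, which
are still awaiting confirmation; the function name values_match and the
"kind" labels are made up for the example and are not part of any existing
tool. The XML format is left out, since it follows XML's own rules.

```python
# Kinds of values assumed to be case-sensitive (CARG and surface strings).
CASE_SENSITIVE_KINDS = {"carg", "surface"}


def values_match(kind, a, b):
    """Compare two SimpleMRS values of the given (hypothetical) kind."""
    if kind.lower() in CASE_SENSITIVE_KINDS:
        # CARGs and surface strings must match exactly.
        return a == b
    # Predicates, role names, variables, properties, etc. ignore case.
    return a.lower() == b.lower()


print(values_match("pred", "_cat_n_1_rel", "_Cat_n_1_rel"))  # True
print(values_match("role", "ARG0", "arg0"))                  # True
print(values_match("carg", "Kim", "kim"))                    # False
```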
Thanks for bringing this up. It helped me find some issues with pyDelphin
that I thought I'd already addressed.
On Mon, Dec 28, 2015 at 9:17 AM Stephan Oepen <oe at ifi.uio.no> wrote:
> dear colleagues,
> in these days of reflection, i would like to ask for opinions on an
> aspect of the DELPH-IN informalism where dan and i recently discovered
> that we held conflicting opinions. thus, we are looking for folks
> with a deeper understanding of the issue.
> for MRS predicate symbols, we have long established that we do not
> want case differences or type vs. string distinctions to be
> meaningful, i.e. we do not expect foo, Foo, or "foo" to name different
> relations (see ‘MrsRfc’ on the wiki). from this, i had concluded that
> no grammar would ever use both foo and "foo", whereas dan has found it
> convenient in the ERG to use comparable type names and strings (across
> lexical entries of different syntactic categories) in the expectation
> that they would be treated as equivalent MRS predicate symbols, e.g.
> _downtown_a_1_rel and "_downtown_a_1_rel".
> my assertion that the above was an undesirable property for a
> DELPH-IN grammar is supported by current software: MRS comparison,
> transfer, and generation do not treat types and strings as equivalent;
> a creator of input semantics for generation, for example, needs to
> know about the distinction and make a choice.
> what dan believes, however, arguably makes good sense (to me at least).
> i believe i can see how the various pieces of MRS manipulation
> software could be extended to yield the interpretation of equivalence.
> i would volunteer to make these changes in the Lisp implementation of
> MRS-related code.
> before suggesting a course of forward action, i would like to ask (a)
> whether anyone has strongly held positions (and supporting arguments)
> on the general question and (b) whether woodley and mike would be
> prepared to make software changes in ACE and pyDelphin, respectively,
> regarding this choice?
> with thanks in advance, oe