[developers] case distinctions in (R)MRS predicates and constants

Ann Copestake Ann.Copestake at cl.cam.ac.uk
Thu Sep 7 22:27:53 CEST 2006


> i would like to elicit a quick policy decision regarding (R)MRS names
> (yes, something as easy as that :-).  i believe it is meant for PREDs
> _and_ CARG values not to be case-sensitive.  the former, i expect, is
> non-controversial.  for the latter, someone might think that "foo" and
> "Foo" should be distinct constants (parameters).  personally, i see no
> practical value in making that distinction, but i have no preferences
> either way.  in a sense, less downcasing might be the right thing ...

We discussed this during Deep Thought and decided that all PREDs are
case-insensitive (written in lower case) and that CARG values are
case-sensitive.  I believe that Dan, for one, argued that he wanted
this distinction to be available.
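
To make the policy concrete, here is a minimal sketch in Python (not LKB
code).  The function names are hypothetical and only illustrate the two
comparison rules: PREDs are compared after downcasing, CARG values are
compared exactly as written.

```python
def normalize_pred(pred: str) -> str:
    """Return the canonical form of a PRED (hypothetical helper).

    PREDs are case-insensitive and conventionally written in lower case.
    """
    return pred.lower()


def preds_match(a: str, b: str) -> bool:
    # Case-insensitive comparison: downcase both sides first.
    return normalize_pred(a) == normalize_pred(b)


def cargs_match(a: str, b: str) -> bool:
    # Case-sensitive comparison: "foo" and "Foo" are distinct constants.
    return a == b
```

Under these assumptions, preds_match("_Dog_n_1_rel", "_dog_n_1_rel") holds,
while cargs_match("Foo", "foo") does not.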

> the current code seems not fully consistent in this respect.  either i
> would propose to change make-unknown-word-sense-unifications() (which
> is an ERG function, but i believe you wrote it) to not downcase; 

No, this is not inconsistent.  What happens at the point when one is
trying to build a lexical entry in the absence of a grammar writer
does not reflect on whether grammar writers should be allowed to
distinguish case in CARG!  Making this function preserve case is a bad
idea, even though case in CARG is significant, because in any practical
application you'll get e.g. `LONDON' and `London' (and quite likely
`london', especially if parsing some people's email).  But, in any
event, that function was just a quick hack to get the QA stuff working
(though it seems to have lost the comment to that effect).  I believe
the assumption should still be that the LKB, as it is generally used
(i.e., as a GDE), does not try to guess unknown words.
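
As a rough illustration of why downcasing makes sense for guessed entries,
here is a small Python sketch.  It is not the actual
make-unknown-word-sense-unifications() code; unknown_word_carg is a
hypothetical stand-in that assumes the CARG for an unknown word is simply
the downcased surface form.

```python
def unknown_word_carg(surface: str) -> str:
    # Assumption: with no grammar writer to choose a canonical spelling,
    # the guessed constant is the downcased surface form.
    return surface.lower()


# Three spellings that might turn up in real input for the same name.
forms = ["LONDON", "London", "london"]

# Preserving case would give three distinct constants; downcasing gives one.
preserved = set(forms)
downcased = {unknown_word_carg(form) for form in forms}
```

Here preserved has three members and downcased has one, which is the
behaviour argued for above.  Hand-written lexical entries are unaffected
and can still distinguish case in CARG.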

> or i
> would volunteer to make matches-rel-record() ignore case mismatches on
> elements of *value-feats*.  which way should we go?  once and for all.

I'd prefer to stay as we are.  If a grammar writer does not want to
have case distinctions in CARG, they can make all the CARG values
lower case.

More generally, I have agreed in principle that case should be
preservable in the LKB if grammar writers wish, and though I haven't gone
through the code to get that working, it is something I still intend
to do.

Ann


