[developers] processing of lexical rules

Ann Copestake Ann.Copestake at cl.cam.ac.uk
Tue May 3 14:58:26 CEST 2005


The changes to irregs actually need a bit of thinking through, because they 
affect how one thinks of the status of irregs formally.  We could 
view irregular morphology as just a funny form of spelling rule that we keep 
in the irregs.tab file for convenience (and possibly we construct this from 
the lexical db, but again as a matter of convenience).  In this case, we are 
assuming that any _string_ with this spelling will have this affixation 
effect.  Right now, the assumption is that any _stem_ with this spelling will 
have that affixation, which is a bit different (it is sort of intermediate 
between the approach above and the view that irregular spellings are part of 
the lexical specification for a lexeme).
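
To make the three possible statuses concrete, here is a minimal Python sketch. 
It is only an illustration of one reading of the distinction: the table 
contents, the lexeme identifiers and all function names are hypothetical, and 
none of it reflects the actual irregs.tab format or the LKB code.

```python
# Toy data, invented for illustration (past tense only).
IRREGS_BY_SPELLING = {"hang": "hung", "do": "did"}

# Hypothetical lexicon: lexeme id -> stem orthography.
# Note that no lexeme with the stem "do" is listed here.
LEXICON = {"hang_suspend": "hang", "hang_execute": "hang"}

# Irregular forms stored on individual lexemes (third view).
IRREGS_BY_LEXEME = {"hang_suspend": "hung"}


def regular_past(s):
    """Deliberately crude regular spelling rule."""
    return s + "d" if s.endswith("e") else s + "ed"


def past_by_string(s):
    """View 1: any string with this spelling gets the irregular form."""
    return IRREGS_BY_SPELLING.get(s, regular_past(s))


def past_by_stem(s):
    """View 2: only spellings that are the stem of some lexical entry."""
    if s in LEXICON.values():
        return IRREGS_BY_SPELLING.get(s, regular_past(s))
    return regular_past(s)


def past_by_lexeme(lexeme_id):
    """View 3: irregularity is part of one lexeme's specification."""
    stem = LEXICON[lexeme_id]
    return IRREGS_BY_LEXEME.get(lexeme_id, regular_past(stem))
```

In this toy setup the first two views differ only for a spelling that is not 
a lexical stem ("do" here), while only the third view can give the two "hang" 
lexemes different past forms.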

Assuming that irregularity is associated with all lexemes with a particular 
orthography (as opposed to individual lexemes) is roughly right for English 
(hang/hanged/hung being an exception) but what about other languages?  Do we 
need to support a version where it is associated with individual lexemes?
If I change the current code as suggested above, will that cause any immediate 
problems?  (I can't see anything that will go wrong for the ERG, though I think 
it won't actually help: e.g., it won't capture `undo'/`undid' if we assume 
that the bracketing is ((un do) past).)
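
The bracketing point can be sketched in a few lines of Python. This is a toy 
model under stated assumptions, not the LKB's behaviour: the table, the crude 
regular rule and the function names are all made up for the example.

```python
# Hypothetical irregular table keyed by spelling (past tense only).
IRREGS = {"do": "did"}


def regular_past(s):
    """Deliberately crude regular spelling rule."""
    return s + "d" if s.endswith("e") else s + "ed"


def past(s):
    """Look the whole input string up in the table, else apply the regular rule."""
    return IRREGS.get(s, regular_past(s))


def un(s):
    """Toy prefixation rule."""
    return "un" + s


outer = past(un("do"))   # bracketing ((un do) past): past sees "undo"
inner = un(past("do"))   # bracketing (un (do past)): past sees "do"
```

With the first bracketing the past rule is handed "undo", which has no entry 
in the table, so the lookup misses whether it is keyed on strings or on stems; 
only the second bracketing, or a separate entry for "undo", yields "undid".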

Ann




