[developers] Distinguishing between orthographemic and non-orthographemic rules

Fri Jul 1 00:17:16 CEST 2016

Hi developers,

Since a version or two ago, ACE has taken it upon itself to keep the strings under STEM (or whatever the orth-path configuration points to) up to date during morphological changes.  Mike recently noticed that these changes have caused Jacy to stop working with ACE on certain inputs.  In my view, there are fundamentally two different types of (lexical) rules in our universe:

- rules for which the processor (ACE, LKB, PET, Agree, what-have-you) is responsible for determining the STEM value on the mother edge, as a function of the STEM value on the daughter edge.  That function is determined by %suffix and %prefix statements, as well as irregular form tables.

- rules for which the grammar is responsible for determining the STEM value of the mother edge.  This will generally amount to a reentrancy between the mother’s STEM and the daughter’s STEM, but in principle could be an explicit change.
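
To make the distinction concrete, here is a minimal Python sketch of the two classes.  It is a toy model, not ACE code: the names Edge, processor_rule and grammar_rule are hypothetical, a plain string append stands in for real %suffix patterns, and a dict stands in for the irregular form table.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class Edge:
    # Hypothetical stand-in for an edge; None models an unconstrained STEM.
    stem: Optional[str]


def processor_rule(suffix: str,
                   irregulars: Optional[Dict[str, str]] = None
                   ) -> Callable[[Edge], Edge]:
    """Class 1: the processor computes the mother's STEM from the daughter's.

    Assumption: a bare string append replaces real %suffix pattern matching.
    """
    table = dict(irregulars or {})

    def apply(daughter: Edge) -> Edge:
        if daughter.stem is None:
            raise ValueError("daughter STEM is underspecified")
        if daughter.stem in table:          # irregular forms take priority
            return Edge(table[daughter.stem])
        return Edge(daughter.stem + suffix)

    return apply


def grammar_rule(daughter: Edge) -> Edge:
    """Class 2: the grammar fixes the mother's STEM, here by reentrancy."""
    return Edge(daughter.stem)


past = processor_rule("ed", {"go": "went"})
regular = past(Edge("walk"))      # processor appends the suffix
irregular = past(Edge("go"))      # processor consults the irregulars table
copied = grammar_rule(Edge("walk"))  # grammar passes STEM through unchanged
```

In this model the only difference between the classes is who supplies the function from daughter STEM to mother STEM: the processor (from orthographemic declarations) or the grammar (from its own constraints).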

In the case of Jacy, it seems that the rule "vbar-monotransitivization-c-lrule" neither declares orthographemic changes (by %suffix, %prefix, or entries in the irregulars table) nor declares a value for the mother’s STEM.  My questions for the community are: (1) are we in agreement that these two fundamental classes of lexical rules exist, as outlined above? and (2) if so, how should a processor decide which case a given rule falls into?  Currently ACE treats the Jacy rule in question as belonging to the latter class, since it declares no orthographemic changes.  Because the grammar does not constrain the mother’s STEM either, this leaves the mother with something like [ STEM < *top* > ].  When a subsequent lexical rule that does have an orthographemic reflex applies, ACE then crashes (not ideal, but I want feedback about how to improve it :-)).
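
The failure mode described above can be sketched roughly as follows.  This is a toy model of my reading of the situation, not ACE's actual implementation: LexRule, processor_owns_stem, apply_rule and apply_chain are hypothetical names, the stem "tabe" and the rules "copying-lrule" and "hypothetical-suffix-lrule" are made up for illustration, and None stands in for < *top* >.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class LexRule:
    name: str
    # Stands in for any %suffix, %prefix, or irregulars-table declaration.
    suffix: Optional[str] = None
    # True if the grammar makes the mother's STEM reentrant with the daughter's.
    mother_stem_from_daughter: bool = False


def processor_owns_stem(rule: LexRule) -> bool:
    """Decision as described above: no declared changes means grammar-owned."""
    return rule.suffix is not None


def apply_rule(rule: LexRule, daughter_stem: Optional[str]) -> Optional[str]:
    if processor_owns_stem(rule):
        if daughter_stem is None:
            # The analogue of the crash: nothing concrete to inflect.
            raise ValueError(f"{rule.name}: daughter STEM is underspecified")
        return daughter_stem + rule.suffix
    # Grammar-owned: STEM is whatever the grammar says, possibly nothing.
    return daughter_stem if rule.mother_stem_from_daughter else None


def apply_chain(rules: List[LexRule], stem: Optional[str]) -> Optional[str]:
    for rule in rules:
        stem = apply_rule(rule, stem)
    return stem


copying = LexRule("copying-lrule", mother_stem_from_daughter=True)
silent = LexRule("vbar-monotransitivization-c-lrule")  # declares neither
inflect = LexRule("hypothetical-suffix-lrule", suffix="s")

ok = apply_chain([copying, inflect], "tabe")   # STEM survives the first rule
lost = apply_chain([silent], "tabe")           # STEM is left unconstrained
```

Calling apply_chain([silent, inflect], "tabe") raises ValueError in this model, because the second rule receives an unconstrained STEM from the first.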

Thanks,
Woodley

n.b. I might be wrong about exactly which part of Jacy is underspecified, as I don’t have access to full debugging facilities right now, but I believe the above is an accurate summary of the situation.