[developers] Many a cat sleeps.
sweaglesw at sweaglesw.org
Sat Mar 9 01:44:31 CET 2013
Hi Dan and Glenn, and developers,
ACE parses "Many a cat sleeps." without passing the [many_a_det] lexeme
through any lexical rules at all -- and why should it? You are of
course right that the irregular plural of "many" gives rise to a
hypothesis that the token "many" in this input might be a plural, but
fortunately that is not the only hypothesis; since "many a" appears in
the lexicon with no orthographemic changes, ACE happily allows the
corresponding lexeme into the chart with no orthographemic agenda. (If
there were orthographemic changes, regular or irregular, which operated
on the left periphery of "many" or the right periphery of "a", ACE would
also admit a version of the corresponding MWE lexeme into the chart with
that nonempty orthographemic agenda.)
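
A toy sketch in Python of the behaviour described above may help. It is not ACE's actual code: the table layout, the function names and the rule name N_PL_OLR are assumptions made for illustration (only the many_a_det lexeme name comes from this thread), and the input is assumed to be lowercased already. It covers only the case where the MWE matches the input with no orthographemic changes, not the periphery case in the parenthetical.

```python
# Toy model, not ACE's implementation. All names below except the
# many_a_det lexeme are assumptions made for illustration.

# (surface form, rule name, stem); N_PL_OLR is a hypothetical rule name.
IRREGULARS = [("many", "N_PL_OLR", "many")]

# Multi-word lexemes keyed by their token sequence as listed in the lexicon.
MWE_LEXICON = {("many", "a"): "many_a_det"}


def token_hypotheses(form):
    """Return every (stem, agenda) reading of a single token."""
    readings = [(form, ())]  # the form as-is, with nothing left to apply
    for surface, rule, stem in IRREGULARS:
        if surface == form:
            readings.append((stem, (rule,)))  # the plural hypothesis
    return readings


def mwe_edges(tokens):
    """Admit each MWE whose listed tokens match the input unchanged."""
    edges = []
    for words, lexeme in MWE_LEXICON.items():
        n = len(words)
        for start in range(len(tokens) - n + 1):
            if tuple(tokens[start:start + n]) == words:
                # The match needed no orthographemic change, so the
                # edge enters the chart with an empty agenda.
                edges.append((lexeme, start, start + n, ()))
    return edges


tokens = ["many", "a", "cat", "sleeps"]
print(token_hypotheses("many"))
print(mwe_edges(tokens))
```

In this sketch the plural reading of "many" exists alongside the plain one, and the "many a" edge is built from the plain match, so it carries no pending rule.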
Perhaps when the LKB and AGREE find an entry in the irregulars table
that matches the surface form they are considering, they assume that it
is the only legal interpretation of that surface form? And it only
happens for MWEs? How odd!
It surely isn't correct to stop at the first matching irregular form;
for example, "grown" can be V_PSP_OLR(grow) or V_PAS_ODLR(grow). I seem to
recall some cute examples where an irregular form under one rule
collides with a regular form under another rule too, though I don't seem
to be able to reconstruct any such data points just now.
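
The "grown" case can be sketched in a few lines of Python. This is an illustration only, not how the LKB, AGREE or ACE store or search the irregulars table; the tuple layout and both function names are assumptions.

```python
# Hypothetical in-memory irregulars table: (surface form, rule, stem).
IRREGULARS = [
    ("grown", "V_PSP_OLR", "grow"),
    ("grown", "V_PAS_ODLR", "grow"),
]


def first_match(form):
    """Stop at the first matching entry (the behaviour argued against)."""
    for surface, rule, stem in IRREGULARS:
        if surface == form:
            return [(rule, stem)]
    return []


def all_matches(form):
    """Collect every matching entry, so no reading is lost."""
    return [(rule, stem) for surface, rule, stem in IRREGULARS
            if surface == form]


print(first_match("grown"))
print(all_matches("grown"))
```

With first_match the passive reading of "grown" never reaches the chart; all_matches keeps both.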
On 03/08/2013 02:42 PM, Dan Flickinger wrote:
> The source of the trouble is that "many" is also listed in the irregs.tdl file as having the irregular plural "many", which is needed for the current analysis of sentences like "The many (who like chocolate) are happy." This is the same analysis as for "rich" in "The rich are sometimes happy." For this construction, the de-adjectival noun is listed in the lexicon as an uninflected noun, but one that requires application of the plural inflectional rule, so each of these de-adjectival nouns must also have a corresponding entry in the irregs.tdl file, to avoid having a spurious "-s" added by the inflectional rule. Now when the LKB (and apparently also AGREE) tries to construct a multi-word edge for "many a", the morphology reports that there is an inflectional rule to apply to "many" to make a plural (as specified in the irregs file), and this "To-Do" list is (I believe wrongly) also preserved in the resulting multi-word edge for "many a". This edge is therefore marked as doomed in the chart, since it seems to require application of the plural inflectional rule, which of course cannot apply.