[developers] prefixation of multi word lexemes

Ann Copestake Ann.Copestake at cl.cam.ac.uk
Mon Aug 8 19:55:51 CEST 2005


Whoops - somewhere in the course of the last message I forgot that the issue
was prefixation due to punctuation, and therefore forgot about the fairly
annoying problem that has arisen (which I realised a couple of months ago but
don't think I've discussed).

If we're talking about proper morphology, it is, I think, reasonable to say
that the `words with spaces' (WWS) mechanism should only apply to things with
at most one affixation site.  So one could do `kick the bucket' with inflection
on `kick', but if the idiom allowed for `kicked the buckets' (cf. letting cats
out of bags) then the dual inflection would be good evidence that the idiom was
decomposable/compositional (whatever terminology you want to use) and that the
WWS mechanism was unsuitable.
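
As a minimal sketch of the `at most one affixation site' assumption (the
names WWSEntry, site and add_suffix are hypothetical, and plain string
concatenation stands in for real morphology; this is not how the LKB
represents it):

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class WWSEntry:
    """Hypothetical multiword entry with a single affixation site."""
    words: Tuple[str, ...]
    site: int  # index of the one subword allowed to carry an affix


def add_suffix(entry: WWSEntry, suffix: str) -> str:
    # Toy morphology: plain concatenation on the marked subword only.
    words = list(entry.words)
    words[entry.site] += suffix
    return " ".join(words)


KICK_THE_BUCKET = WWSEntry(("kick", "the", "bucket"), site=0)
print(add_suffix(KICK_THE_BUCKET, "ed"))  # kicked the bucket
```

With only one site per entry there is nothing to order: `kicked the buckets'
simply cannot be produced from this entry.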

BUT, once we get punctuation involved as morphology, we can't assume this
any more.  This is a little unpleasant.  For English we might get away with
saying there is a suffixation site on the last element of the WWS and a
prefixation site on the first (which would only be for punctuation, at least
with the current ERG), but this won't work for French, where adjectives can be
postnominal and hence a WWS can get inflected on a non-final element (cf.
English `surgeons general').  The words with spaces mechanism works by treating
the morphology as initially on the individual `subwords', hence we would have
to combine partial trees (i.e., the morphology associated with each subword),
interleaving them non-deterministically ...  Not nice.
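
To see why the interleaving is unpleasant, here is a rough sketch that
enumerates every way of merging per-subword rule sequences while keeping each
subword's own order.  The function name and the rule labels are made up for
illustration; real partial trees carry more structure than a flat list.

```python
from typing import Iterator, List


def interleavings(sequences: List[List[str]]) -> Iterator[List[str]]:
    """Yield every merge of the sequences that preserves each one's order."""
    seqs = [s for s in sequences if s]
    if not seqs:
        yield []
        return
    for i, s in enumerate(seqs):
        # Take the next rule from sequence i, then merge whatever is left.
        rest = seqs[:i] + [s[1:]] + seqs[i + 1:]
        for tail in interleavings(rest):
            yield [s[0]] + tail


# Hypothetical rule applications attached to two subwords of one WWS.
first_subword = ["punct-prefix", "plural"]
last_subword = ["punct-suffix-1", "punct-suffix-2"]
print(len(list(interleavings([first_subword, last_subword]))))  # 6
```

Two subwords with two rules each already give six candidate orderings, and
the count grows multinomially as subwords or rules are added.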

If we have a suitably rich way of marking affixation position possibilities on
WWS, then we can presumably treat these things as different classes of rules
and avoid the non-determinism in practice by saying punctuation comes last.
But at this point, I think we have to use the FSs to indicate affixation sites
on WWSs rather than relying on global variables and fixed positions for
affixation in WWSs.
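
A sketch of what that might look like, under stated assumptions: a plain
dict stands in for the part of the FS that records affixation sites, the rule
classes and their ordering are hypothetical, and string concatenation again
stands in for the actual rules.

```python
from typing import Dict, List, Tuple

# Hypothetical rule classes, in the order they are applied: punctuation last.
RULE_CLASS_ORDER = ("inflection", "punctuation")

Site = Tuple[str, str]          # (rule class, "prefix" or "suffix")
Affix = Tuple[str, str, str]    # (rule class, "prefix" or "suffix", form)


def apply_affixes(words: List[str], sites: Dict[Site, int],
                  affixes: List[Affix]) -> str:
    out = list(words)
    # Stable sort: inflection first, punctuation last, input order otherwise.
    for rule_class, kind, form in sorted(
            affixes, key=lambda a: RULE_CLASS_ORDER.index(a[0])):
        if (rule_class, kind) not in sites:
            raise ValueError(f"no {kind} site for {rule_class} on this entry")
        i = sites[(rule_class, kind)]
        out[i] = form + out[i] if kind == "prefix" else out[i] + form
    return " ".join(out)


# Sites are declared per entry rather than fixed globally.
SURGEON_GENERAL_SITES = {
    ("inflection", "suffix"): 0,    # plural goes on the first subword
    ("punctuation", "prefix"): 0,
    ("punctuation", "suffix"): 1,
}
print(apply_affixes(["surgeon", "general"], SURGEON_GENERAL_SITES,
                    [("punctuation", "suffix", ","),
                     ("inflection", "suffix", "s")]))
# surgeons general,
```

Because each entry says where each class of rule may attach, and the classes
are applied in a fixed order, there is a single derivation rather than a set
of interleavings to search.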

Ann



