[matrix] Matrix Coordination Module
Scott Drellishak
sfd at u.washington.edu
Mon May 2 12:11:53 CEST 2005
Hi, all. I'm working on a coordination module for the Matrix. I'd like to
describe my current plans and see if you have either any objections or any
additional suggestions.
My analysis for coordination is based pretty heavily on the ERG's -- in
particular, the coordination module will use binary branching structures to
simulate N-ary branching coordination, and there will be three abstract
rules from which all part-of-speech-specific rules derive:
(1) head-marker, which handles the marking of a node as coordinated.
(2) mid-coord, which joins a node on the left with a node marked
    coordinated on the right, with the mother node marked coordinated.
(3) top-coord, which joins a node on the left with a node marked
    coordinated on the right, with the mother node *not* marked coordinated.
So we'll have the following structures for the following examples (using
X-top, X-mid, and X-head as abbreviations for the three rules above, and
ignoring punctuation):
(4) fire and steel
[NP-top [NP fire] [NP-head [CONJ and] [NP steel]]]
(5) beg, borrow, or steal
[V-top [V beg] [V-mid [V borrow] [V-head [CONJ or] [V steal]]]]
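To make the binary-branching idea concrete, here is a minimal Python sketch that builds the bracketings in (4) and (5) from a flat list of coordinands. It is only an illustration, not part of the Matrix or the ERG: the function name coord_tree and its arguments are made up for this example, and it assumes a single preposed conjunction on the rightmost coordinand.

```python
def coord_tree(cat, coordinands, conj):
    """Return a bracketed binary-branching analysis of a coordination.

    Hypothetical helper for illustration only; `cat` is the phrase type
    (e.g. "NP"), `coordinands` the words being coordinated, `conj` the
    conjunction attached to the rightmost coordinand.
    """
    if len(coordinands) < 2:
        raise ValueError("coordination needs at least two coordinands")
    # head-marker: mark the rightmost coordinand as coordinated.
    tree = f"[{cat}-head [CONJ {conj}] [{cat} {coordinands[-1]}]]"
    # mid-coord: one application per medial coordinand, working leftwards;
    # the mother stays marked as coordinated.
    for word in reversed(coordinands[1:-1]):
        tree = f"[{cat}-mid [{cat} {word}] {tree}]"
    # top-coord: attach the leftmost coordinand; the mother is not marked.
    return f"[{cat}-top [{cat} {coordinands[0]}] {tree}]"


print(coord_tree("NP", ["fire", "steel"], "and"))
print(coord_tree("V", ["beg", "borrow", "steal"], "or"))
```

However many coordinands there are, the result uses one head-marker, one top-coord, and one mid-coord per medial coordinand, which is how three binary rules stand in for N-ary branching.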
The new feature SYNSEM.LOCAL.COORD will be a list containing the
coordination relation specified down in the head-marker rule and all the
collected handles and indices of the coordinands. (This is called CONJ in
the ERG, but since some languages mark coordination without separate lexical
conjunctions, I'm avoiding using "CONJ" in the general case.)
By varying the definitions of top-coord and mid-coord, we can require
different strategies for marking coordination. If the left coordinand in
both rules must have a non-empty COORD, then we require coordination marking
(e.g. a conjunction) on each coordinand:
(6) and A and B and C and D
If the left coordinand must have an empty COORD, then we require
coordination marking only on the rightmost coordinand:
(7) A B C and D
Various other patterns are also possible.
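The effect of the constraint on the left coordinand can be sketched in Python as follows. This is a toy model under stated assumptions, not the actual type definitions: a boolean `coord` flag stands in for an empty versus non-empty COORD list, and the names Node, head_marker, join, and coordinate are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    text: str
    coord: bool  # True stands in for a non-empty COORD list (assumption)


def head_marker(conj, node):
    # Mark a node as coordinated by attaching a preposed conjunction.
    return Node(f"{conj} {node.text}", True)


def join(left, right, left_must_be_marked, mother_marked):
    # Shared body of mid-coord (mother_marked=True) and
    # top-coord (mother_marked=False).
    if not right.coord:
        raise ValueError("right daughter must be marked coordinated")
    if left.coord != left_must_be_marked:
        raise ValueError("left daughter violates the marking strategy")
    return Node(f"{left.text} {right.text}", mother_marked)


def coordinate(items, conj, left_must_be_marked):
    if len(items) < 2:
        raise ValueError("coordination needs at least two coordinands")
    nodes = [Node(w, False) for w in items]
    if left_must_be_marked:
        nodes = [head_marker(conj, n) for n in nodes]  # pattern (6)
    else:
        nodes[-1] = head_marker(conj, nodes[-1])       # pattern (7)
    right = nodes[-1]
    for left in reversed(nodes[1:-1]):
        right = join(left, right, left_must_be_marked, True)   # mid-coord
    return join(nodes[0], right, left_must_be_marked, False)   # top-coord


print(coordinate(list("ABCD"), "and", True).text)   # and A and B and C and D
print(coordinate(list("ABCD"), "and", False).text)  # A B C and D
```

Flipping the one boolean is what switches between (6) and (7). A daughter with the wrong marking is rejected by `join`, in the same way that a unification failure would block the corresponding rule.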
Similarly, by varying the head-marker rule, we can get preposed conjunctions
(as in English), postposed conjunctions, or some kind of non-lexical marking
(e.g. a suffix or a different verb form).
As described so far, the top-coord rule isn't doing much work -- more or
less everything could be handled by mid-coord. However, I intend eventually
to implement something equivalent to the ERG's LEFT feature, which is a
two-item list that contains the CONJ values for complex conjunctions like
"either...or", where "either" is applied by the top rule, and any number of
"or"s are applied by repeated application of the mid-rule. (As with CONJ
above, I think I'm going to rename this feature, perhaps to COORD-STRAT.)
I'm in the middle of the first-pass implementation of this module, which
will include the abstract rules and also phrase-type-specific rules for NP,
VP, AP, PP, and probably some others. Near the end of this month (May), a
group of students in Emily Bender's Grammar Engineering course who are
working on grammars of various languages will attempt to integrate my
implementation of coordination into their grammars. That should provide
some welcome testing.
After making any changes based on what I learn from that testing, I'll
figure out how my rules will be integrated into the module scripts. That
will involve structuring the questions asked (e.g. "Does your language
coordinate all phrases with the same coordination strategy?"), and also
deciding on a file structure. I'm currently leaning towards one file with
all the abstract rules, and then one file for each specific factored rule
(e.g. (a) VP-coordination (b) with morphological marking (c) on each
coordinated verb, versus (a) NP-coordination (b) marked by conjunctions (c)
with only one conjunction per coordinated list).
If you have any suggestions, please let me know. Thanks in advance for your
time!
Scott Drellishak
"The Coordination Coordinator"
University of Washington Linguistics
sfd at u.washington.edu