[developers] enhanced SEM-I support in the forthcoming 1214 release of the ERG
Stephan Oepen
oe at ifi.uio.no
Fri Apr 29 15:21:31 CEST 2016
dear colleagues,
woodley and i are currently working to implement a decade-old vision
in ACE and the LKB, respectively: MRS manipulation in terms of the
explicit SEM-I specification (rather than against the grammar-internal
hierarchy). we have had the SEM-I declare the inventory of most of
the predicates and their synopses (sets of applicable argument labels
and associated value constraints, if any) since around 2005. but what
has so far been missing was a specified inventory of available
predicate abstractions and the hierarchical relations holding among
them. also, it turns out, there were some missing predicates
that originate from specialization obtained during unification, e.g.
the directional vs. stative vs. temporal sub-division of prepositional
relations.
in the forthcoming 1214 release of the ERG, the SEM-I has been
extended with more thorough declarations of
(0) the range of variable types, e.g. ‘i’ (for individuals), ‘e’ (for
eventualities) and ‘x’ (for instances);
(1) the inventory of variable properties and associated values, e.g.
‘PERS’(on) and ‘1’, ‘2’, and ‘3’;
(2) a partial hierarchy of predicates and complete lists of legitimate
surface and abstract predicates.
i am writing to alert you to these changes and to maybe elicit some
last-minute feedback on the imminent 1214 release. you can inspect
the new SEM-I files in the ‘etc/’ directory of the 1214 ERG (which is
mapped to the ‘lingo/erg/’ directory of the LOGON tree) or straight
from SVN:
http://svn.delph-in.net/erg/tags/1214/etc
as before, ‘erg.smi’ is the top-level entry point containing (0) and
(1) from the above list. a small (manually maintained) part of the
predicate hierarchy is in that file too, but the bulk of (2) above is
distributed over ‘hierarchy.smi’, ‘abstract.smi’, and ‘surface.smi’.
the current SEM-I exposes 121 abstract predicates, which dan, emily,
and i hope to document (to at least some degree) as part of our
ongoing ‘ErgSemantics’ efforts (see the corresponding wiki pages).
this is in contrast to some 500 abstract, non-glb types in the
grammar-internal predicate hierarchy, i.e. the SEM-I is masking a
large number of distinctions that the ERG makes internally.
so, in case you have used ERG predicate abstractions in the past (e.g.
in MT work) that you feel are motivated and should be preserved for
future generations, we would be grateful if you could try to identify
any such predicates and look for them in the draft SEM-I for the 1214
release. with a much tighter SEM-I now, i think it will be
tempting to turn on pre-generation testing for SEM-I compliance at
some point soon. but we do expect that there will have to be more
fine-tuning of SEM-I contents over time.
to give you a better feel for what SEM-I abstractions can do for you,
we have collected a handful of test cases in a new ‘mrs/’
sub-directory of the ERG sources (in its forthcoming 1214 version),
together with corresponding generator outputs from the LKB (woodley
and i hope to compare between ACE and the LKB in the next few days):
0 oe at mv (~/src/logon/lingo/erg) 156 $ ls -l mrs/
total 36
-rw-r--r--. 1 oe oe 0 Apr 28 23:54 can_able.lkb
-rw-r--r--. 1 oe oe 722 Apr 28 23:43 can_able.mrs
-rw-r--r--. 1 oe oe 61 Apr 28 23:54 _in_p.lkb
-rw-r--r--. 1 oe oe 898 Apr 28 23:35 _in_p.mrs
-rw-r--r--. 1 oe oe 4028 Apr 28 23:54 nn.lkb
-rw-r--r--. 1 oe oe 917 Apr 28 23:31 nn.mrs
-rw-r--r--. 1 oe oe 130 Apr 28 23:54 temp_loc_sp.lkb
-rw-r--r--. 1 oe oe 904 Apr 28 23:34 temp_loc_sp.mrs
-rw-r--r--. 1 oe oe 136 Apr 28 23:54 universal_q.lkb
-rw-r--r--. 1 oe oe 495 Apr 28 23:36 universal_q.mrs
note that we take advantage of this new separation of church and state
(i.e. the SEM-I vs. grammar-internal hierarchy, or maybe the other way
around) to not only (a) hide many of the grammar-internal predicate
distinctions, but also to (b) introduce additional abstractions not
present in the grammar, and to (c) add some parent–child links between
predicates that are ‘missing’ in the grammar. an example of (b) is
the sub-division of quantifiers into broad classes of existentials vs.
universals and the (playful) addition of an abstraction over different
ways of realizing an underspecified relation between two nominals,
e.g. (from ‘nn.mrs’):
A jungle lion arrived.
A jungle's lion arrived.
A lion of a jungle arrived.
an example of (c), finally, is making the temporal senses of
prepositions like ‘at’, ‘in’, and ‘on’ specializations of the
‘general’ predicates associated with these prepositions, e.g.
_in_p_temp < temp_loc_sp & _in_p.
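to illustrate how such a link behaves under multiple inheritance, here is a minimal sketch in python (not the LKB or ACE implementation). the `PARENTS` table and the `ancestors` and `subsumes` helpers are hypothetical names made up for this illustration; only the three predicate names are taken from the example above.

```python
# hypothetical parent map: each predicate maps to its immediate parents.
# only the _in_p_temp entry mirrors the declaration quoted above.
PARENTS = {
    "_in_p_temp": {"temp_loc_sp", "_in_p"},
    "temp_loc_sp": set(),
    "_in_p": set(),
}


def ancestors(pred, parents=PARENTS):
    """Collect every predicate reachable upward from pred (pred excluded)."""
    seen = set()
    stack = list(parents.get(pred, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def subsumes(general, specific, parents=PARENTS):
    """True if general is specific itself or one of its ancestors."""
    return general == specific or general in ancestors(specific, parents)
```

under this toy hierarchy, `subsumes("_in_p", "_in_p_temp")` and `subsumes("temp_loc_sp", "_in_p_temp")` both hold, while the two parents remain unrelated to each other.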
as regards software support, the following operations on MRSs can now
optionally be keyed off the SEM-I rather than off the grammar-internal
hierarchies:
+ initialization of the generator chart;
+ MRS comparison, including post-generation;
+ matching transfer rules to MRS fragments.
in the LKB, there is a new global parameter *normalize-predicates-p*
to enable SEM-I support for the above operations. for the time being
this mode is off by default (in the LKB) but will be enabled by
default in the 1214 release of the ERG. as of today, the
*normalize-predicates-p* switch is only available for ongoing testing
in the LOGON copy of the LKB, but i plan on pushing these changes
upstream soon. as a side-effect of MRS manipulation against the
SEM-I, some of the corner cases we have discussed previously
disappear: predicates are case-insensitive, the ‘_rel’ suffix is
always stripped off, and there is no distinction made between (quoted)
‘strings’ and (unquoted) ‘types’.
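the three normalizations just listed can be sketched roughly as follows in python. this is an illustration only and not the LKB code: the function name `normalize_predicate` is hypothetical, and both the use of double quotes as the string marker and the order of the steps are assumptions.

```python
def normalize_predicate(pred):
    """Normalize a predicate name along the lines described above (a sketch)."""
    name = pred.strip()
    # assumption: quoted 'strings' carry surrounding double quotes; drop them
    # so that strings and unquoted 'types' end up looking the same
    if len(name) >= 2 and name[0] == '"' and name[-1] == '"':
        name = name[1:-1]
    # predicates are treated as case-insensitive, so fold to lower case
    name = name.lower()
    # always strip the '_rel' suffix if present
    if name.endswith("_rel"):
        name = name[: -len("_rel")]
    return name
```

with this sketch, a quoted `"_in_p_rel"`, an upper-case `_IN_P_REL`, and a plain `_in_p` all normalize to the same name, so a comparison of predicates no longer depends on spelling variants.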
best wishes, oe