[sdp] Call for Participation: Cross-Framework Meaning Representation Parsing at CoNLL 2020

Tue Mar 31 14:13:17 CEST 2020

[with apologies for cross-posting]

Despite much uncertainty about the in-person conference season this
year, we are excited to invite participants to the Shared Task at the
2020 Conference on Computational Natural Language Learning (CoNLL):

  Cross-Framework Meaning Representation Parsing (MRP 2020)

For background on the nature of the task and its schedule, please see:

  http://mrp.nlpl.eu

If you are potentially interested, please sign up for future updates:

  http://lists.nlpl.eu/mailman/listinfo/mrp-users

A sample of sentences annotated with MRP graphs in five frameworks is at:

  http://svn.nlpl.eu/mrp/2019/public/sample.tgz

OBJECTIVES

The goal of the task is to advance data-driven parsing into
graph-structured representations of sentence meaning.  All things
semantic have been receiving heightened attention in recent years.  And
despite remarkable advances in vector-based (continuous and
distributed) encodings of meaning, ‘classic’ (discrete and
hierarchically structured) semantic representations will continue to
play an important role in ‘making sense’ of natural language.  While
parsing has long been dominated by tree-structured target
representations, there is now growing interest in general graphs as
more expressive and arguably more adequate target structures.

For the first time, this task combines formally and linguistically
different approaches to meaning representation in graph form in a
uniform training and evaluation setup.  Participants are invited to
develop parsing systems that support five distinct semantic graph
frameworks—which all encode core predicate–argument structure, among
other things—in the same implementation.  Training and evaluation data
will be provided for all five frameworks.  Participants are asked to
design and train a system that predicts sentence-level meaning
representations in all frameworks in parallel.  Architectures that
utilize complementary knowledge sources (e.g. via parameter sharing
and multi-task learning) are encouraged (though not required).
Learning from multiple flavors of meaning representation in tandem has
hardly been explored.

The task seeks to reduce framework-specific ‘balkanization’ in the
field of meaning representation parsing.  Expected outcomes include
(a) a unifying formal model over different semantic graph banks, (b)
uniform representations and framework-agnostic scoring, (c) systematic
contrastive evaluation across frameworks, and (d) increased
cross-fertilization via transfer and multi-task learning.  We hope to
engage the combined community of parser developers for
graph-structured output representations, including participants in six prior
framework-specific tasks at the Semantic Evaluation exercises between
2014 and 2019.  Owing to scarcity of semantic annotations across
frameworks, the shared task is organized into two tracks: (a)
cross-framework MRP, regrettably limited to English for the time
being, and (b) multi-lingual MRP, with one additional language for
each framework.

FRAMEWORKS

The task combines five frameworks for graph-based meaning
representation, each with its specific formal and linguistic
assumptions.

+ Prague Tectogrammatical Graphs (Hajič et al., 2012)
+ Elementary Dependency Structures (Oepen & Lønning, 2006)
+ Universal Conceptual Cognitive Annotation (Abend & Rappoport, 2013)
+ Abstract Meaning Representation (Banarescu et al., 2013)
+ Discourse Representation Graphs (Bos et al., 2017)

For the shared task, we have repackaged different graph banks into a
uniform and normalized abstract representation with a common
serialization format (in JSON).  Training data comprising semantic
graphs over a total of some 3.5 million tokens in running English text
is available to participants; additional, multi-lingual data will be
provided in mid-May.  High-quality tokenization, PoS tagging,
lemmatization, and Universal Dependency parse trees are provided as an
optional ‘companion’ resource.  For all frameworks, both in- and
out-of-domain evaluation data will be provided in the same unified
format.
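
As a rough illustration of how a uniform JSON serialization of semantic
graphs could be consumed, the following Python sketch parses one graph
per line and extracts labeled node-edge-node triples.  The field names
(nodes, edges, source, target, label) and the sample graph are
assumptions made for this example; the task web site is the
authoritative description of the actual format.

```python
import json

# Hypothetical one-line serialization of a tiny graph.
# All field names here are assumed for illustration only.
SAMPLE_LINE = json.dumps({
    "id": "example-1",
    "framework": "eds",
    "input": "Dogs bark.",
    "nodes": [{"id": 0, "label": "dog"}, {"id": 1, "label": "bark"}],
    "edges": [{"source": 1, "target": 0, "label": "ARG1"}],
})


def read_graphs(lines):
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in lines if line.strip()]


def labeled_triples(graph):
    """Return (source label, edge label, target label) triples for a graph."""
    labels = {node["id"]: node.get("label") for node in graph.get("nodes", [])}
    return {
        (labels[edge["source"]], edge.get("label"), labels[edge["target"]])
        for edge in graph.get("edges", [])
    }


graphs = read_graphs([SAMPLE_LINE])
print(labeled_triples(graphs[0]))
```

Keeping one graph per line would allow the same reader to be used for
all five frameworks, with only the label inventories differing.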

SCHEDULE

+ March 30, 2020: Availability of Starting Data Package
+ April 27, 2020: Initial Release of 2020 Training Data
+ May 25, 2020: Data Updates; Additional Languages
+ June 8, 2020: Closing Date for Extra Data Nominations
+ July 20–August 3, 2020: Evaluation Period (Held-Out Data)
+ September 7, 2020: Submission of System Descriptions
+ November 11–12, 2020: Presentation of Results at CoNLL

EVALUATION

For each of the individual frameworks, there are common ways of
evaluating the quality of parser outputs in terms of graph similarity
to gold-standard target representations.  There is broad similarity
between the framework-specific evaluation metrics used to date,
although there are some subtle differences too.  In a nutshell,
meaning representation parsing is commonly evaluated in terms of a
graph similarity F1 score at the level of individual node–edge–node
triples, i.e. ‘atomic’ dependencies.

For the shared task, we provide a (relatively straightforward)
generalization of existing, framework-specific metrics that (a) is
applicable across different flavors of semantic graphs, (b)
distinguishes separate ‘types’ of information, (c) does not require
matching node anchoring in the underlying string, but (d) takes
advantage of node ordering when available.  Labeled per-dependency
scores, macro-averaged across all frameworks, will be the official
metric for the task; but we will also provide additional
cross-framework evaluation perspectives, as well as scoring in
established framework-specific metrics.
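
To make the idea of a triple-level F1 score and its macro-average
across frameworks concrete, here is a simplified Python sketch.  It
assumes that gold and predicted graphs have already been reduced to
directly comparable (source, label, target) triples; it does not model
the search for a node-to-node correspondence or the separate ‘types’
of information, and it is not the official scorer.  The function
names, framework keys, and example triples are hypothetical.

```python
def triple_f1(gold, predicted):
    """F1 over two collections of (source, label, target) triples."""
    gold, predicted = set(gold), set(predicted)
    if not gold and not predicted:
        return 1.0  # two empty graphs are treated as identical
    matched = len(gold & predicted)
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_average(scores):
    """Unweighted mean of per-framework scores."""
    return sum(scores.values()) / len(scores)


# Hypothetical gold and predicted triples for one sentence.
gold = {("bark", "ARG1", "dog"), ("loud", "ARG1", "bark")}
pred = {("bark", "ARG1", "dog"), ("bark", "ARG2", "cat")}

per_framework = {
    "framework_a": triple_f1(gold, pred),  # one of two triples matches
    "framework_b": triple_f1(gold, gold),  # perfect prediction
}
print(per_framework, macro_average(per_framework))
```

Macro-averaging gives each framework equal weight regardless of how
many triples its graphs contain, so no single framework dominates the
overall score.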

INVOLVEMENT

We invite all possibly interested parties to self-subscribe to the
mailing list for this task; the subscription link and access
information for the training data are available from the task web
site:

  http://mrp.nlpl.eu

Please do not hesitate to contact the task organizers with questions or
clarifications, using the joint email address provided on the task web
pages.  And stay safe and healthy!

Omri Abend, Lasha Abzianidze, Johan Bos, Jan Hajič,
Daniel Hershcovich, Marco Kuhlmann, Bin Li,
Stephan Oepen, Tim O'Gorman, and Nianwen Xue