[developers] RESTful ERG parsing

Stephan Oepen oe at ifi.uio.no
Mon Apr 4 21:40:27 CEST 2016


hi again, mike (and all)!

> A more generic "variables" object seems like a good idea. Just a small
> thing: mixing variable properties and other data ("type") has the
> possibility for collisions (i.e. a variable property named "type"). We could
> either put the variable properties in a sub-object, or say something like
> "variable property keys are always uppercase, other keys are lower or mixed
> case".

the latter is indeed the generalization i am assuming:
grammar-specific identifiers (role names and variable properties,
which are conceptually case-insensitive) are serialized in upper-case,
structural properties (top, index, relations, and such) are in
lower-case.  thus, i figured we can avoid the separate embedding of
‘properties’ and re-gain some compactness.
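
to illustrate the case convention, here is a minimal sketch in Python.  the layout of the variable entry (a flat object with a "type" key next to the properties) and the property names are assumptions for the sake of the example, not a fixed part of the interface.

```python
def split_variable(entry):
    """Separate structural keys from grammar-specific properties by case.

    Assumes the convention discussed above: grammar-specific identifiers
    are upper-case, structural keys are lower-case (or mixed case).
    """
    structural = {k: v for k, v in entry.items() if not k.isupper()}
    properties = {k: v for k, v in entry.items() if k.isupper()}
    return structural, properties


# hypothetical entry from a "variables" object; names are illustrative only
entry = {"type": "e", "SF": "prop", "TENSE": "past"}
structural, properties = split_variable(entry)
```

because the two kinds of keys can never share a spelling, a property that happens to be called TYPE would not collide with the structural "type".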

> Do you mean that the absence of an argument value in the "variables" object
> indicates it's a constant value? That seems handy. I'm definitely in favor
> of reducing reliance on grammar-specific configuration files.

yes, that is exactly how i imagine the distinction being available as
a first-class piece of information in the JSON rendering.  and i
figured you might be in favor of interpretability without
grammar-specific parameters :-).
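
a small sketch of how a client could make use of this, assuming (hypothetically) that each relation carries an "arguments" object and that the top-level "variables" object is keyed by variable name; the example relation and its values are made up.

```python
def constant_arguments(relation, variables):
    """Return the arguments whose values are constants.

    Under the convention above, an argument value that has no entry in
    the "variables" object is taken to be a constant, so no
    grammar-specific configuration is needed to tell the two apart.
    """
    return {
        role: value
        for role, value in relation["arguments"].items()
        if value not in variables
    }


# hypothetical fragment of a serialized structure
variables = {"x6": {"type": "x", "PERS": "3", "NUM": "sg"}}
relation = {
    "label": "h7",
    "predicate": "named",
    "arguments": {"ARG0": "x6", "CARG": "Abrams"},
}
constants = constant_arguments(relation, variables)  # only CARG remains
```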

> Also, I notice that "hcons" became "constraints" in the latest version,
> which means that ICONS go in the same list as HCONS (which is not
> necessarily problematic)?

yes, once the extra ‘variables’ object gave us a structure that is no
longer isomorphic (though, of course, equivalent) to the ‘simple’
serialization, i felt we did not need a structural distinction between
different types of constraints.  i assume they will be distinguished
by their argument labels, i.e. ‘high’ and ‘low’ vs. ‘left’ and
‘right’.
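
as a sketch of what telling the two kinds apart by their argument labels could look like: the constraint layout below (a "relation" key plus two labelled arguments) is an assumption, and the labels are compared case-insensitively because their final spelling is not settled here.

```python
def constraint_kind(constraint):
    """Classify an entry of the shared "constraints" list by its labels."""
    labels = {key.lower() for key in constraint}
    if {"high", "low"} <= labels:
        return "handle"       # what used to live under "hcons"
    if {"left", "right"} <= labels:
        return "individual"   # what would otherwise live under "icons"
    return "unknown"


# hypothetical entries; relation names and variables are illustrative only
constraints = [
    {"relation": "qeq", "high": "h0", "low": "h1"},
    {"relation": "topic", "left": "e2", "right": "x6"},
]
kinds = [constraint_kind(c) for c in constraints]
```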

—but of course many of these minor (albeit, in my view, important)
design decisions we are currently debating are ultimately arbitrary.
my primary goal is consistency and readability, hence i shied away
from mixing abbreviations and full-length (structural) property names.
as i see things now, the choice largely should be between

(a) top, index, relations, label, predicate, arguments, constraints,
relation, variables, ...
(b) top, index, rels, lbl, pred, args, hcons, icons, rel, vars, ...
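
to make the difference concrete, here is a hedged sketch that rewrites a structure using the names in (b) into the names in (a).  the pairing of short and long names is read off the order of the two lists, the merging of "hcons" and "icons" into one "constraints" list follows the discussion above, and the sample structure is invented.

```python
# assumed correspondence between the abbreviated and full-length names
LONG_NAMES = {
    "rels": "relations",
    "lbl": "label",
    "pred": "predicate",
    "args": "arguments",
    "rel": "relation",
    "vars": "variables",
}


def to_long_names(node):
    """Recursively rename (b)-style keys to their (a)-style equivalents."""
    if isinstance(node, list):
        return [to_long_names(item) for item in node]
    if not isinstance(node, dict):
        return node
    result = {}
    for key, value in node.items():
        value = to_long_names(value)
        if key in ("hcons", "icons"):
            # both kinds of constraints end up in a single list
            result.setdefault("constraints", []).extend(value)
        else:
            result[LONG_NAMES.get(key, key)] = value
    return result


# hypothetical (b)-style structure
short = {
    "top": "h0",
    "index": "e2",
    "rels": [{"lbl": "h1", "pred": "_bark_v_1", "args": {"ARG0": "e2"}}],
    "hcons": [{"rel": "qeq", "high": "h0", "low": "h1"}],
    "icons": [],
    "vars": {"e2": {"type": "e"}},
}
long_form = to_long_names(short)
```

upper-case keys and variable names pass through unchanged, since only the structural (lower-case) names differ between the two options.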

my current preference is for (a), but i would be happy to hear
additional opinions.  i would like to freeze this part of the RESTful
interface (for version 0.9, at least; the URL scheme i am proposing
includes protocol versioning and hence, in principle, allows for
changes later on) before the end of this week.  hence, anyone who wants
to vote on (a) vs (b) above, or argue other fine points of JSON
serialization, please do so in the next couple of days!

nb: i just changed ‘roles’ to ‘arguments’, to avoid invoking the
over-loaded notion of semantic (or even thematic) roles.

> But the Requests package (http://requests.rtfd.org) seems well-liked. It's
> not in Python's standard library, but the documentation for urllib even
> suggests it. Regarding Python versions, I'd try to first support versions
> 3.3+, then 2.7 if it's not too much work.

for such a lightweight client (11 lines of code), i thought it was
tempting to avoid dependencies on third-party packages.  i think i
will standardize on 3.3 and upwards then, and if and when you get to
adding a client interface to pyDelphin, i will let you find out
whether and how to work around Unicode limitations in 2.x :-).
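
for illustration only, a dependency-free client along these lines might look roughly as follows in Python 3.  this is not the actual client: the base URL, the "/parse" path, and the "input" parameter name are all placeholders, since the URL scheme is not spelled out in this message.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# placeholder base URL with the protocol version in the path (an assumption)
BASE = "http://localhost:8080/rest/0.9"


def parse_url(sentence, base=BASE):
    """Build the request URL; urlencode() percent-encodes text as UTF-8."""
    return base + "/parse?" + urlencode({"input": sentence})


def parse(sentence, base=BASE):
    """Send the request and decode the JSON response body."""
    with urlopen(parse_url(sentence, base)) as response:
        return json.loads(response.read().decode("utf-8"))
```

only the standard library is used here; in Python 3, strings are Unicode throughout, so non-ASCII input needs no special handling before it is encoded into the URL.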

—once the basic (version 0.9) interface is stable, i will look into
HTTP header processing and error codes.

thanks again for all your feedback!  oe


