[developers] RESTful ERG parsing

Michael Wayne Goodman goodmami at u.washington.edu
Tue Mar 29 01:00:58 CEST 2016


Hi Stephan,

On Mon, Mar 28, 2016 at 1:53 PM Stephan Oepen <oe at ifi.uio.no> wrote:

> dear colleagues,
>
> i used part of the easter break to teach myself about modern
> technologies and am currently in the process of providing a RESTful
> (programmatic) interface to the on-line ERG demonstrator.  i know of
> at least one colleague who has been waiting impatiently for this
> functionality :-).
>
> in a nutshell, client software can now obtain parses using the HTTP
> protocol and URIs providing the input string (and a handful of
> optional parameters).  for example:
>
>   http://erg.delph-in.net/rest/0.9/parse?input=Abrams%20arrived.
>
> parsing results will be returned in machine-readable format,
> serialized as a JSON document.  for a little more background on how to
> use this new service (including an example client in Python, believe
> it or not), please see:
>
>   http://moin.delph-in.net/ErgApi


What a beautiful bike shed :)

BTW, Demophin has an undocumented HTTP API, but it's not RESTful:

$ curl -F 'sentence=Abrams arrived.' \
    http://chimpanzee.ling.washington.edu/demophin/erg/parse

I had hoped to change it to follow the REST principles more closely and
document the API, but I'm happy to know that you've already started that
effort.
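
For comparison, here is a minimal sketch of how a client might build the GET
request from your example using only the Python standard library. The helper
names (build_parse_url, fetch_parse) are made up for illustration, and I'm
assuming only what the example URL shows: an "input" parameter plus optional
extra parameters that are passed through unchanged.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Endpoint taken from the example URL in the announcement.
BASE_URL = "http://erg.delph-in.net/rest/0.9/parse"


def build_parse_url(sentence, base_url=BASE_URL, **params):
    """Return the GET URL for parsing `sentence` (hypothetical helper)."""
    query = {"input": sentence}
    query.update(params)  # optional parameters are passed through untouched
    # quote_via=quote encodes spaces as %20 rather than '+'
    return base_url + "?" + urlencode(query, quote_via=quote)


def fetch_parse(sentence, **params):
    """Fetch and decode the JSON response; requires network access."""
    with urlopen(build_parse_url(sentence, **params)) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling build_parse_url("Abrams arrived.") reproduces the URL from the
announcement; fetch_parse just decodes whatever JSON document comes back and
makes no assumption about its structure.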


>
> there is some more work to be done on the interface (see the page
> above), but i would like to ask for help already at this point:
>
> (0) in case you notice anything surprising in the interactive ERG
> demonstrator, please do not hesitate to let me know!
>

If you're defining a new JSON schema for EDS, then maybe we can do
something more convenient for, e.g., lnk values. Currently the indices are
encoded in a string:

    "lnk": "<0:6>"

If we make it a JSON object, then users of the results wouldn't have to
parse the string later:

    "lnk": {"type": "charspan", "cfrom": 0, "cto": 6}

(The "type" could be optional if we define "charspan" as the default, or if
we pretend that the other types don't exist)
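
To illustrate what clients currently have to do, here is a small sketch that
converts the string form into the object form proposed above. The function
name lnk_to_object is hypothetical, and it only handles the character-span
case (the "<from:to>" shape); other lnk types are rejected.

```python
import re

# Matches the character-span form only, e.g. "<0:6>".
_CHARSPAN = re.compile(r"^<(-?\d+):(-?\d+)>$")


def lnk_to_object(lnk):
    """Convert a "<cfrom:cto>" string into the proposed JSON-style object."""
    match = _CHARSPAN.match(lnk)
    if match is None:
        raise ValueError("not a character-span lnk value: %r" % (lnk,))
    return {
        "type": "charspan",
        "cfrom": int(match.group(1)),
        "cto": int(match.group(2)),
    }
```

If the service emitted the object directly, none of this would be needed on
the client side.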


> (1) i still need to provide a serialization of MRSs in JSON; in case
> anyone has previously tackled this (design) problem, please do get in
> touch!
>

Not yet, sorry.  One thing that comes to mind is that JSON doesn't have an
unordered collection aside from objects (hashes), which require keys. So we
would have to treat the RELS, HCONS, and ICONS bags as arrays (lists), but we
often do this anyway, so I think that's fine. Here's a rather direct
conversion:

{ "top": "h0", "index": "e2",
  "rels": [ {"pred": "proper_q", "lbl": "h3", "arg0": "x4"...}...]...}

One thing that isn't obvious is where to put variable properties. We could
follow the EDS example and put them in the EP object (where they would
similarly be controlled by the "properties" parameter in the URL):

{ ..., "rels": [ { "pred": "named", ...,
                   "properties": { "PERS": "3"}}, ... ], ...}
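
As a rough sketch of that direct conversion (not a proposal for the final
schema), something like the following would produce it. The function names
and the include_properties flag are made up for illustration; the flag only
mimics what a "properties" URL parameter might control. The bags are emitted
as plain JSON arrays, and I leave HCONS and ICONS empty since their element
format is still undecided.

```python
import json


def ep_to_dict(pred, lbl, arg0, properties=None, include_properties=True):
    """Build one EP object; properties are attached only when requested."""
    ep = {"pred": pred, "lbl": lbl, "arg0": arg0}
    if include_properties and properties:
        ep["properties"] = dict(properties)
    return ep


def mrs_to_json(top, index, rels, hcons=(), icons=()):
    """Serialize an MRS, treating the RELS/HCONS/ICONS bags as arrays."""
    doc = {
        "top": top,
        "index": index,
        "rels": list(rels),
        "hcons": list(hcons),
        "icons": list(icons),
    }
    return json.dumps(doc)


# Values below are the ones from the fragments above; nothing else is filled in.
example = mrs_to_json(
    "h0", "e2",
    [ep_to_dict("proper_q", "h3", "x4", properties={"PERS": "3"})],
)
```

Array order carries no meaning here, so consumers should not rely on it.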


> (2) i think it might be nice to incorporate RESTful parsing as an
> option in pyDelphin; mike, could you be interested in collaborating on
> this?
>

Yes. Whatever API we settle on, I'd like to incorporate that into pyDelphin
and use it as the basis for Demophin as well.


> finally, i would be curious to hear comments or suggestions for how to
> use and extend this service (though cannot promise i will have a lot
> of time to develop this further until another holiday break); please
> see towards the bottom of the above wiki page for some candidate
> directions.
>

I've done a couple of REST APIs so far, so I have some suggestions (some
are rather technical so I'm happy to save those for an off-list discussion).

One thing that might be relevant to others is how we can request other
formats. I see you have parameters "eds=...", "derivation=...", "mrs=...",
so presumably we could expand it with others ("rmrs=...", "dmrs=...", etc.)?

Other ideas:
* what about generation?
* can we set a request header for preprocessing? (e.g. morphological
segmentation for Jacy or Zhong)
* If we have already preprocessed, can we specify the Content-Type? (e.g.
Content-Type: application/yy)
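
To make the Content-Type idea concrete, here is a sketch of what such a
request might look like from the client side. Everything about it is an
assumption: the service doesn't currently accept this as far as I know,
"application/yy" is just the placeholder type from the bullet above, using
POST is my guess, and build_preprocessed_request is a made-up name. The
request is only constructed, not sent.

```python
from urllib.request import Request


def build_preprocessed_request(url, body, content_type="application/yy"):
    """Build (but do not send) a POST carrying already-preprocessed input."""
    return Request(
        url,
        data=body.encode("utf-8"),  # preprocessed input goes in the body
        headers={"Content-Type": content_type},
        method="POST",
    )


# The body here is a placeholder, not real preprocessor output.
request = build_preprocessed_request(
    "http://erg.delph-in.net/rest/0.9/parse", "placeholder input"
)
```

The server could then dispatch on the Content-Type header instead of needing
a separate URL parameter for each input format.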

> best wishes; god påske!  oe
>
>