[developers] Regarding to German sentence analysis.
Michael Wayne Goodman
goodmami at uw.edu
Wed Feb 22 19:17:11 CET 2017
Thanks, Ann. Francis said (offline) something similar about those being
grammar-internal properties. I've updated the RmrsVpm wiki with an example
of how to remove them via a VPM (
http://moin.delph-in.net/RmrsVpm#Corner_Cases). Is there any better place
to document this information?
I also thought it might affect MRS's XML format, but then I remembered
that property names are stored as element text rather than as attribute
names, so they don't suffer from the same problem:
...
<extrapair><path>--PSV</path><value>non-apsv</value></extrapair>
...
It would affect the RMRX format, though.
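
To illustrate the difference, here is a minimal sketch using only Python's standard-library XML parser. The `<var .../>` fragment is a made-up stand-in for any format that stores the property name as an attribute name (it is not actual DMRX or RMRX markup); the `<extrapair>` fragment is the one shown above.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: property name used as an XML attribute name.
as_attribute = '<var --PSV="non-apsv"/>'
# Fragment from the message above: property name stored as element text.
as_element_text = (
    '<extrapair><path>--PSV</path><value>non-apsv</value></extrapair>'
)


def parses(xml_text):
    """Return True if the standard-library parser accepts the fragment."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


attribute_ok = parses(as_attribute)        # name starts with "-": rejected
element_ok = parses(as_element_text)       # "--PSV" is just character data
path_text = ET.fromstring(as_element_text).findtext('path')

print(attribute_ok, element_ok, path_text)
```

The failure in the first case comes from the XML parser itself, before any PyDelphin code would see the data.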
On Feb 22, 2017 04:42, "Ann Copestake" <aac10 at cl.cam.ac.uk> wrote:
> from memory, the interpretation of the "--" was intended to be that this
> was something that should not appear in the external MRS
>
> in any case, it wouldn't be a DMRS issue, as such, since presumably it
> could apply to any of the MRS XML formats
>
> All best,
>
> Ann
>
> On 21/02/17 20:21, Michael Wayne Goodman wrote:
>
> Ann:
> Does the LKB do anything special regarding properties like --PSV when
> reading/writing DMRX? They aren't ill-formed in the SimpleMRS format, so
> I'm wondering if PyDelphin should attempt to do anything special for these.
>
> Megha:
> I just thought of another alternative. You can serialize to the
> DMRS-JSON format instead of DMRX. JSON doesn't have the same attribute name
> constraints as XML. The process is slightly different. Here is
> MRS->DMRS-JSON conversion:
>
> import json
> from delphin.mrs import simplemrs
> from delphin.mrs.xmrs import Dmrs
> print(
>     json.dumps(
>         Dmrs.from_xmrs(simplemrs.load_one(source)).to_dict()
>     )
> )
>
> (The Dmrs.from_xmrs(...) step is only needed for Python 2 compatibility.
> In Python 3, you can simply do: Dmrs.to_dict(simplemrs.load_one(source)))
>
> DMRS-JSON -> MRS conversion is similar:
>
> ...
> print(
>     simplemrs.dumps_one(
>         Dmrs.from_dict(json.load(source))
>     )
> )
>
> These methods require PyDelphin v0.6.0 (the latest release).
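
As a small illustration of why JSON sidesteps the problem, the sketch below round-trips a dictionary whose key starts with "--" using only the standard `json` module. The dictionary is a hypothetical stand-in for a node's properties and is not meant to reproduce the actual DMRS-JSON layout.

```python
import json

# Hypothetical property dictionary; NOT the actual DMRS-JSON schema.
properties = {"--PSV": "non-apsv", "TENSE": "past"}

# JSON object keys are arbitrary strings, so "--PSV" needs no renaming.
encoded = json.dumps(properties)
decoded = json.loads(encoded)

print(decoded == properties)
```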
>
> On Tue, Feb 21, 2017 at 11:35 AM, Michael Wayne Goodman <goodmami at uw.edu>
> wrote:
>
>> Hi Megha,
>>
>> (I've re-CC'd the developers list so they can benefit or contribute;
>> please include them in follow-up replies)
>>
>> Thanks for clarifying.
>>
>> When you do MRS -> DMRS conversion in your script, it is essentially this:
>>
>> print(dmrx.dumps(simplemrs.load(source)))
>>
>> This loads the simplemrs-encoded source (e.g. a file or sys.stdin; or use
>> simplemrs.loads() for a string argument) into the internal *MRS
>> representation, then the dmrx codec serializes the internal representation
>> to DMRX. Doing DMRS -> MRS conversion is the same, but reversed:
>>
>> print(simplemrs.dumps(dmrx.load(source)))
>>
>> (More technically, the dmrx and simplemrs codecs decode the text
>> streams/strings and instantiate the Dmrs() and Mrs() classes, respectively,
>> in the delphin.mrs.xmrs module. It is these classes (and not the codecs
>> themselves) that do the actual conversion into the internal format.)
>>
>> However, there is a problem with the GG grammar and DMRS. The variable
>> properties prefixed by "--" (e.g. "--PSV") cause errors when loading a
>> DMRS. This is because the hyphen is not a valid initial character in an XML
>> attribute name (https://www.w3.org/TR/REC-xml/#NT-NameStartChar). It is
>> Python's XML parser, and not PyDelphin, that is failing to load the DMRX
>> instance. I suggest doing one of the following:
>>
>> 1. Change the attribute names in the GG grammar
>>
>> 2. In your conversion script, find and replace these attributes on the
>> MRS before converting to DMRS, and change them back in DMRS->MRS
>> conversion. You may use underscores (e.g. "__PSV") as the initial
>> character, according to the XML spec.
>>
>> Does this help?
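
Option 2 could be sketched roughly as follows. This is only an illustration under some assumptions: the helper names `hide_dashes` and `restore_dashes` are made up, the input is a hypothetical SimpleMRS property fragment, and the regular expressions assume that the affected properties appear as whitespace-delimited `--NAME:` tokens and that no other property name already starts with "__".

```python
import re

# Match a leading "--" (or "__") on a whitespace-delimited "NAME:" token.
_DASHES = re.compile(r'(?<!\S)--(?=[\w-]+:)')
_UNDERSCORES = re.compile(r'(?<!\S)__(?=[\w-]+:)')


def hide_dashes(simplemrs_text):
    """Rename --PROP to __PROP before MRS -> DMRS (DMRX) conversion."""
    return _DASHES.sub('__', simplemrs_text)


def restore_dashes(simplemrs_text):
    """Rename __PROP back to --PROP after DMRS -> MRS conversion."""
    return _UNDERSCORES.sub('--', simplemrs_text)


# Hypothetical fragment of a SimpleMRS variable property list.
fragment = '[ e SF: prop --PSV: non-apsv ]'
hidden = hide_dashes(fragment)
restored = restore_dashes(hidden)

print(hidden)
print(restored == fragment)
```

`hide_dashes` would be applied to the SimpleMRS text before it is passed to `simplemrs.loads()`, and `restore_dashes` to the string returned by `simplemrs.dumps()` on the way back.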
>>
>> On Tue, Feb 21, 2017 at 1:13 AM, megha jain <jain11megha at gmail.com>
>> wrote:
>>
>>> Hello Michael,
>>>
>>> I know how to use PyDelphin, so I can implement this with it.
>>>
>>> I would like to know which Python code you use to convert a German DMRS
>>> back into a German MRS, so that this MRS can be given to ACE to generate
>>> the corresponding German sentence.
>>>
>>> The steps I am working with are:
>>> German sentence => MRS
>>> MRS => DMRS
>>> DMRS => MRS (this is my concern)
>>>
>>> EXAMPLE:
>>>
>>> (A) INPUT: Abrams bellte sehr leise.
>>>
>>> (B) When I gave the above sentence to ACE, it generated an MRS
>>> (command used: ./ace -g ggp.dat -1Tf input_file.txt).
>>> The file is attached below.
>>>
>>> (C) I gave this MRS as input to the mrs_to_dmrs-pp.py Python script, and
>>> the corresponding DMRS was generated.
>>> The file is attached below.
>>>
>>> (D) After this, I want to convert the corresponding DMRS into an MRS. Which
>>> Python code can be used for this?
>>>
>>> I hope this makes my concern clear.
>>>
>>> Thank you.
>>>
>>>
>>
>>
>> --
>> Michael Wayne Goodman
>> Ph.D. Candidate, UW Linguistics
>>
>
>
>
> --
> Michael Wayne Goodman
> Ph.D. Candidate, UW Linguistics
>
>
>