[developers] Regarding to German sentence analysis.
Michael Wayne Goodman
goodmami at uw.edu
Tue Feb 21 21:21:47 CET 2017
Ann:
Does the LKB do anything special regarding properties like --PSV when
reading/writing DMRX? They aren't ill-formed in the SimpleMRS format, so
I'm wondering if PyDelphin should attempt to do anything special for these.
Megha:
I just thought of another alternative. You can serialize to the DMRS-JSON
format instead of DMRX. JSON doesn't have the same attribute name
constraints as XML. The process is slightly different. Here is
MRS->DMRS-JSON conversion:
    import json
    from delphin.mrs import simplemrs
    from delphin.mrs.xmrs import Dmrs
    print(
        json.dumps(
            Dmrs.from_xmrs(simplemrs.load_one(source)).to_dict()
        )
    )
(The Dmrs.from_xmrs(...) call is only there for Python 2 compatibility. In
Python 3, you can just do: Dmrs.to_dict(simplemrs.loads_one(source)))
DMRS-JSON -> MRS conversion is similar:
    ...
    print(
        simplemrs.dumps_one(
            Dmrs.from_dict(json.load(source))
        )
    )
These methods require PyDelphin v0.6.0 (the latest release).
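
To see the attribute-name difference without PyDelphin, here is a minimal sketch that uses only the Python standard library. It checks that Python's XML parser rejects an attribute named "--PSV" but accepts "__PSV", and that JSON round-trips the same key unchanged. It also sketches the find-and-replace workaround from my earlier message (quoted below). The helper names to_xml_safe and from_xml_safe and the property values are hypothetical, made up for illustration; they are not part of PyDelphin or taken from GG output. The renaming assumes no property name already begins with "__".

```python
import json
import xml.etree.ElementTree as ET


def xml_accepts_attribute(name):
    """Return True if Python's XML parser accepts `name` as an attribute name."""
    try:
        ET.fromstring('<node {}="x"/>'.format(name))
        return True
    except ET.ParseError:
        return False


def to_xml_safe(props):
    """Hypothetical helper: rename a leading "--" to "__" in property names."""
    return {('__' + k[2:] if k.startswith('--') else k): v
            for k, v in props.items()}


def from_xml_safe(props):
    """Hypothetical helper: undo to_xml_safe() after reading the XML back."""
    return {('--' + k[2:] if k.startswith('__') else k): v
            for k, v in props.items()}


# Placeholder property values for illustration only.
props = {'--PSV': 'x', 'TENSE': 'past'}

# A hyphen is not a valid first character of an XML name; an underscore is.
print(xml_accepts_attribute('--PSV'))
print(xml_accepts_attribute('__PSV'))

# JSON keys are arbitrary strings, so no renaming is needed there.
print(json.loads(json.dumps(props)) == props)

# The rename is reversible as long as no name already starts with "__".
print(from_xml_safe(to_xml_safe(props)) == props)
```

In a real conversion script the renaming would have to be applied to the variable properties of the MRS before serializing to DMRX, and reversed after loading the DMRX again.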
On Tue, Feb 21, 2017 at 11:35 AM, Michael Wayne Goodman <goodmami at uw.edu>
wrote:
> Hi Megha,
>
> (I've re-CC'd the developers list so they can benefit or contribute;
> please include them in follow-up replies)
>
> Thanks for clarifying.
>
> When you do MRS -> DMRS conversion in your script, it is essentially this:
>
> print(dmrx.dumps(simplemrs.load(source)))
>
> This loads the simplemrs-encoded source (e.g. a file or sys.stdin; or use
> simplemrs.loads() for a string argument) into the internal *MRS
> representation, then the dmrx codec serializes the internal representation
> to DMRX. Doing DMRS -> MRS conversion is the same, but reversed:
>
> print(simplemrs.dumps(dmrx.load(source)))
>
> (More technically, the dmrx and simplemrs codecs decode the text
> streams/strings and instantiate the Dmrs() and Mrs() classes, respectively,
> in the delphin.mrs.xmrs module. It is these classes (and not the codecs
> themselves) that do the actual conversion into the internal format.)
>
> However, there is a problem with the GG grammar and DMRS. The variable
> properties prefixed by "--" (e.g. "--PSV") cause errors when loading a
> DMRS. This is because the hyphen is not a valid initial character in an XML
> attribute name (https://www.w3.org/TR/REC-xml/#NT-NameStartChar). It is
> Python's XML parser, and not PyDelphin, that is failing to load the DMRX
> instance. I suggest doing one of the following:
>
> 1. Change the attribute names in the GG grammar
>
> 2. In your conversion script, find and replace these attributes on the
> MRS before converting to DMRS, and change them back in DMRS->MRS
> conversion. You may use underscores (e.g. "__PSV") as the initial
> character, according to the XML spec.
>
> Does this help?
>
> On Tue, Feb 21, 2017 at 1:13 AM, megha jain <jain11megha at gmail.com> wrote:
>
>> Hello Michael.
>>
>> I know how to use PyDelphin, so I can implement it that way.
>>
>> I want to know which Python code you use to convert German DMRS back
>> into German MRS.
>>
>> That MRS can then be given to ACE to generate the corresponding German
>> sentence.
>>
>> I am able to process:
>>   German sentence => MRS
>>   MRS => DMRS
>>   DMRS => MRS (that is my concern)
>>
>> EXAMPLE:
>>
>> (A) INPUT: Abrams bellte sehr leise.
>>
>> (B) When I gave the above sentence to ACE, it generated the following MRS
>> (command used: ./ace -g ggp.dat -1Tf input_file.txt).
>>
>> The file is attached below.
>>
>> (C) I gave this MRS as input to the mrs_to_dmrs-pp.py Python script, which
>> generated the corresponding DMRS.
>> The file is attached below.
>>
>> (D) After this, I want to convert the DMRS back into MRS. Which Python
>> code should I use for this?
>>
>>
>> I hope this makes my concern clear.
>>
>> Thank You.
>>
>>
>
>
> --
> Michael Wayne Goodman
> Ph.D. Candidate, UW Linguistics
>
--
Michael Wayne Goodman
Ph.D. Candidate, UW Linguistics