[developers] Regarding to German sentence analysis.
Ann Copestake
aac10 at cl.cam.ac.uk
Wed Feb 22 21:28:31 CET 2017
Thanks - that seems sensible. As long as appropriate search terms are
in place (e.g., so that someone searching for XML will find the page),
I don't think it matters too much where it is on the wiki.
Ann
On 22/02/2017 18:17, Michael Wayne Goodman wrote:
> Thanks, Ann. Francis said (offline) something similar about those
> being grammar-internal properties. I've updated the RmrsVpm wiki with
> an example of how to remove them via a VPM
> (http://moin.delph-in.net/RmrsVpm#Corner_Cases). Is there any better
> place to document this information?
>
> I also thought it might affect the MRS XML format, but then I
> remembered that property names are stored there as element text rather
> than as attribute names, so it doesn't suffer from the same problem:
>
> ...
> <extrapair><path>--PSV</path><value>non-apsv</value></extrapair>
> ...
>
> It would affect the RMRX format, though.
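A minimal sketch of that difference, using only Python's standard xml.etree.ElementTree module. The <extrapair> snippet is the one quoted above; the <node> element and its attribute are hypothetical stand-ins for any format that stores property names as attribute names, and are not meant to reproduce the actual DMRX or RMRX schema.

```python
import xml.etree.ElementTree as ET

# Property name stored as element text: "--PSV" is ordinary character data,
# so the document is well-formed and parses normally.
ok = ET.fromstring(
    "<extrapair><path>--PSV</path><value>non-apsv</value></extrapair>"
)
print(ok.find("path").text)

# Property name stored as an attribute name: "-" is not a valid first
# character of an XML name, so the parser rejects the document.
# (<node> is a hypothetical element used only for illustration.)
try:
    ET.fromstring('<node --PSV="non-apsv"/>')
    attribute_parsed = True
except ET.ParseError as err:
    attribute_parsed = False
    print("ParseError:", err)
```

The failure happens inside the XML parser, before any MRS-specific code gets to look at the property names.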
>
> On Feb 22, 2017 04:42, "Ann Copestake" <aac10 at cl.cam.ac.uk> wrote:
>
> from memory, the interpretation of the "--" was intended to be
> that this was something that should not appear in the external MRS
>
> in any case, it wouldn't be a DMRS issue, as such, since
> presumably it could apply to any of the MRS XML formats
>
> All best,
>
> Ann
>
>
> On 21/02/17 20:21, Michael Wayne Goodman wrote:
>> Ann:
>> Does the LKB do anything special regarding properties like --PSV
>> when reading/writing DMRX? They aren't ill-formed in the
>> SimpleMRS format, so I'm wondering if PyDelphin should attempt to
>> do anything special for these.
>>
>> Megha:
>> I just thought of another alternative. You can serialize to the
>> DMRS-JSON format instead of DMRX. JSON doesn't have the same
>> attribute name constraints as XML. The process is slightly
>> different. Here is MRS->DMRS-JSON conversion:
>>
>> import json
>> from delphin.mrs import simplemrs
>> from delphin.mrs.xmrs import Dmrs
>> print(
>>     json.dumps(
>>         Dmrs.from_xmrs(simplemrs.load_one(source)).to_dict()
>>     )
>> )
>>
>> (The Dmrs.from_xmrs(...) bit is just for Python 2 compatibility.
>> In Python 3, you can just do:
>> Dmrs.to_dict(simplemrs.load_one(source)))
>>
>> DMRS-JSON -> MRS conversion is similar:
>>
>> ...
>> print(
>>     simplemrs.dumps_one(
>>         Dmrs.from_dict(json.load(source))
>>     )
>> )
>>
>> These methods require PyDelphin v0.6.0 (the latest release).
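As a small illustration of why JSON sidesteps the problem, the sketch below round-trips a dictionary whose key starts with "--" using only the standard json module. The dictionary is a hypothetical stand-in for a variable's property map; it is an assumption for illustration and not the actual DMRS-JSON layout produced by PyDelphin.

```python
import json

# Hypothetical property map; the real DMRS-JSON structure may differ.
properties = {"--PSV": "non-apsv", "PERS": "3"}

# JSON object keys are arbitrary strings, so a leading "--" needs no escaping.
encoded = json.dumps(properties)
decoded = json.loads(encoded)

print(encoded)
print(decoded["--PSV"])
```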
>>
>> On Tue, Feb 21, 2017 at 11:35 AM, Michael Wayne Goodman
>> <goodmami at uw.edu> wrote:
>>
>> Hi Megha,
>>
>> (I've re-CC'd the developers list so they can benefit or
>> contribute; please include them in follow-up replies)
>>
>> Thanks for clarifying.
>>
>> When you do MRS -> DMRS conversion in your script, it is
>> essentially this:
>>
>> print(dmrx.dumps(simplemrs.load(source)))
>>
>> This loads the simplemrs-encoded source (e.g. a file or
>> sys.stdin; or use simplemrs.loads() for a string argument)
>> into the internal *MRS representation, then the dmrx codec
>> serializes the internal representation to DMRX. Doing DMRS ->
>> MRS conversion is the same, but reversed:
>>
>> print(simplemrs.dumps(dmrx.load(source)))
>>
>> (More technically, the dmrx and simplemrs codecs decode the
>> text streams/strings and instantiate the Dmrs() and Mrs()
>> classes, respectively, in the delphin.mrs.xmrs module. It is
>> these classes (and not the codecs themselves) that do the
>> actual conversion into the internal format.)
>>
>> However, there is a problem with the GG grammar and DMRS. The
>> variable properties prefixed by "--" (e.g. "--PSV") cause
>> errors when loading a DMRS. This is because the hyphen is not
>> a valid initial character in an XML attribute name
>> (https://www.w3.org/TR/REC-xml/#NT-NameStartChar). It is
>> Python's XML parser, and not PyDelphin, that is failing to
>> load the DMRX instance. I suggest doing one of the following:
>>
>> 1. Change the attribute names in the GG grammar
>>
>> 2. In your conversion script, find and replace these
>> attributes on the MRS before converting to DMRS, and change
>> them back in DMRS->MRS conversion. You may use underscores
>> (e.g. "__PSV") as the initial character, according to the XML
>> spec.
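Option 2 could be sketched roughly as follows, working on the SimpleMRS text before it is handed to the converter. This is only a text-level illustration under stated assumptions: the function names hide_internal_props and restore_internal_props are hypothetical, the sample fragment is made up rather than real GG output, and the regular expressions assume that a property name is immediately followed by ":" as in SimpleMRS property lists. A plain text substitution like this could also touch matching text inside string constants, so renaming on the parsed structure would be more robust.

```python
import re

# "--" at the start of a property name, i.e. not preceded by a word
# character or hyphen, and followed by a name and a colon ("--PSV:").
_HIDDEN = re.compile(r"(?<![\w-])--(?=[A-Za-z][\w.-]*:)")
# The reverse pattern, for names rewritten to start with "__".
_RENAMED = re.compile(r"(?<![\w-])__(?=[A-Za-z][\w.-]*:)")


def hide_internal_props(simplemrs_text):
    """Rewrite '--NAME:' as '__NAME:' so the name is XML-safe."""
    return _HIDDEN.sub("__", simplemrs_text)


def restore_internal_props(simplemrs_text):
    """Undo hide_internal_props() after DMRS -> MRS conversion."""
    return _RENAMED.sub("--", simplemrs_text)


# Made-up fragment of a variable property list, for illustration only.
fragment = "e2 [ e SF: prop PROG: - --PSV: non-apsv ]"
safe = hide_internal_props(fragment)
print(safe)
print(restore_internal_props(safe) == fragment)
```

The single "-" values (as in "PROG: -") and the hyphen inside "non-apsv" are left alone because the patterns only match a doubled prefix directly in front of a property name.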
>>
>> Does this help?
>>
>> On Tue, Feb 21, 2017 at 1:13 AM, megha jain
>> <jain11megha at gmail.com> wrote:
>>
>> Hello Michael.
>>
>> I know how to use PyDelphin, so I am able to implement it that way.
>>
>> I want to know which Python code you use to convert a German DMRS
>> back into a German MRS,
>>
>> so that this MRS can be given to ACE to generate the
>> corresponding German sentence.
>>
>> I am able to process: German sentence => MRS
>> MRS => DMRS
>> DMRS => MRS (that is my concern)
>>
>> EXAMPLE: (A.) INPUT: Abrams bellte sehr leise.
>>
>> (B.) When I gave the above sentence to ACE, it generated the
>> following MRS
>> (command used: ./ace -g ggp.dat -1Tf input_file.txt).
>>
>> The file is attached below.
>>
>> (C.) I gave this MRS as input to the mrs_to_dmrs-pp.py
>> Python script, and the corresponding DMRS was generated.
>> The file is attached below.
>>
>> (D.) After this, I want to convert the corresponding DMRS into
>> an MRS. Which Python code should be used for this?
>>
>>
>> I hope I have made my concern clear.
>>
>> Thank You.
>>
>>
>>
>>
>> --
>> Michael Wayne Goodman
>> Ph.D. Candidate, UW Linguistics
>