[developers] Regarding to German sentence analysis.

Michael Wayne Goodman goodmami at uw.edu
Thu Mar 16 08:37:40 CET 2017


Hi Megha,

I'm afraid I don't know of a good tool for German morphological analysis
offhand. Surely some exist, but I don't work with German data enough to
offer a recommendation. Perhaps someone on the developers list (re-CC'd,
again) knows more than I do.

On Thu, Mar 16, 2017 at 12:30 AM, megha jain <jain11megha at gmail.com> wrote:

> Hello.
>
> Can you let me know of a tool that does German morphological analysis?
> Please share the link.
>
> Thank You.
>
> On Thu, Feb 23, 2017 at 1:36 PM, Michael Wayne Goodman <goodmami at uw.edu>
> wrote:
>
>> Hi Megha,
>>
>> As described in the top paragraph of the command-line tutorial (
>> https://github.com/delph-in/pydelphin/wiki/Command-line-Tutorial), the
>> `delphin` command will be available on your system if you installed
>> PyDelphin via pip (e.g. `pip install pydelphin`). If you instead cloned the
>> repository, the command is `delphin.sh` in the top PyDelphin directory.
>>
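
To check whether the `delphin` command (or `delphin.sh`) is visible on the
current PATH, something like the following standard-library sketch may help.
The helper name `find_command` is made up for illustration; it only wraps
`shutil.which`.

```python
import shutil


def find_command(name):
    """Return the absolute path of an executable on PATH, or None if absent."""
    return shutil.which(name)


# Report where (or whether) each candidate command name resolves.
for candidate in ("delphin", "delphin.sh"):
    location = find_command(candidate)
    print(candidate, "->", location if location else "not found on PATH")
```

If neither name resolves, the remaining option from the message above is to
run `delphin.sh` by its full path inside the cloned PyDelphin directory.
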
>>
>>
>> On Wed, Feb 22, 2017 at 11:17 PM, megha jain <jain11megha at gmail.com>
>> wrote:
>>
>>> Hello.
>>>
>>> I want to know what the delphin path is in the command below:
>>> sed 's/--/__/g' dmrs.txt | delphin convert --from dmrx --to simplemrs |
>>> sed 's/__/--/g'
>>>
>>> When I gave the path home=>pydelphin=>delphin, it was not found.
>>>
>>> Thank You.
>>>
>>> On Thu, Feb 23, 2017 at 11:16 AM, megha jain <jain11megha at gmail.com>
>>> wrote:
>>>
>>>
>>>> ---------- Forwarded message ----------
>>>> From: Stephan Oepen <oe at ifi.uio.no>
>>>> Date: Thu, Feb 23, 2017 at 3:21 AM
>>>> Subject: Re: [developers] Regarding to German sentence analysis.
>>>> To: Ann Copestake <aac10 at cl.cam.ac.uk>, Michael Wayne Goodman <
>>>> goodmami at uw.edu>
>>>> Cc: developers <developers at delph-in.net>, megha jain <
>>>> jain11megha at gmail.com>
>>>>
>>>>
>>>> dear all,
>>>>
>>>> no need to add VPM rules which unconditionally delete: that will be the
>>>> default behavior for any variable properties for which no mapping is
>>>> defined.  mike, could you update your addition to RmrsVpm in this light?
>>>>
>>>> however, GG has explicit VPM rules for --PSV and several others that
>>>> start in double hyphens, so it would seem berthold actually wants these in
>>>> the external interface.
>>>>
>>>> i too was under the impression that the double-hyphen prefix typically
>>>> indicates something grammar-internal, but that could just be an
>>>> ERG-specific convention.
>>>>
>>>> either way, it would seem sad to formally disallow initial hyphens in
>>>> MRS variable properties because they cause problems for some serializations
>>>> of some derived representations.  i would rather look for a
>>>> backwards-compatible extension of the DMRX and RMRX schemas then, to deal
>>>> with the full range of identifiers supported in grammars and native MRSs.
>>>>
>>>> all best, oe
>>>>
>>>>
>>>> On Wed 22 Feb 2017 at 19:17 Michael Wayne Goodman <goodmami at uw.edu>
>>>> wrote:
>>>>
>>>> Thanks, Ann. Francis said (offline) something similar about those being
>>>> grammar-internal properties. I've updated the RmrsVpm wiki with an example
>>>> of how to remove them via a VPM
>>>> (http://moin.delph-in.net/RmrsVpm#Corner_Cases). Is there any better place
>>>> to document this information?
>>>>
>>>> And I also thought it might affect MRS's XML format as well, but then I
>>>> remembered that property names are stored as element text instead of
>>>> attribute names, so they don't suffer the same problem:
>>>>
>>>> ...
>>>> <extrapair><path>--PSV</path><value>non-apsv</value></extrapair>
>>>> ...
>>>>
>>>> It would affect the RMRX format, though.
>>>>
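
As a small illustration of the point about element text: Python's standard
XML parser accepts the fragment quoted above, because `--PSV` is character
data there rather than a name. This is only a sketch on that one-line
fragment, not on a full MRS XML document.

```python
import xml.etree.ElementTree as ET

# The fragment from the message above: the property name is element text.
fragment = "<extrapair><path>--PSV</path><value>non-apsv</value></extrapair>"

pair = ET.fromstring(fragment)    # parses without error
name = pair.findtext("path")      # text content of <path>
value = pair.findtext("value")    # text content of <value>
print(name, value)
```
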
>>>> On Feb 22, 2017 04:42, "Ann Copestake" <aac10 at cl.cam.ac.uk> wrote:
>>>>
>>>> from memory, the interpretation of the "--" was intended to be that
>>>> this was something that should not appear in the external MRS
>>>>
>>>> in any case, it wouldn't be a DMRS issue, as such, since presumably it
>>>> could apply to any of the MRS XML formats
>>>>
>>>> All best,
>>>>
>>>> Ann
>>>>
>>>> On 21/02/17 20:21, Michael Wayne Goodman wrote:
>>>>
>>>> Ann:
>>>>  Does the LKB do anything special regarding properties like --PSV when
>>>> reading/writing DMRX? They aren't ill-formed in the SimpleMRS format, so
>>>> I'm wondering if PyDelphin should attempt to do anything special for these.
>>>>
>>>> Megha:
>>>>   I just thought of another alternative. You can serialize to the
>>>> DMRS-JSON format instead of DMRX. JSON doesn't have the same attribute name
>>>> constraints as XML. The process is slightly different. Here is
>>>> MRS->DMRS-JSON conversion:
>>>>
>>>>     import json
>>>>     from delphin.mrs import simplemrs
>>>>     from delphin.mrs.xmrs import Dmrs
>>>>     print(
>>>>         json.dumps(
>>>>             Dmrs.from_xmrs(simplemrs.load_one(source)).to_dict()
>>>>         )
>>>>     )
>>>>
>>>> (the Dmrs.from_xmrs(...) bit is just for Python2 compatibility. In
>>>> Python3, you can just do: Dmrs.to_dict(simplemrs.loads_one(source)))
>>>>
>>>> DMRS-JSON -> MRS conversion is similar:
>>>>
>>>>     ...
>>>>     print(
>>>>         simplemrs.dumps_one(
>>>>             Dmrs.from_dict(json.load(source))
>>>>         )
>>>>     )
>>>>
>>>> These methods require PyDelphin v0.6.0 (the latest release).
>>>>
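
To illustrate the point about JSON, here is a sketch using only the standard
`json` module. The dictionary is a made-up stand-in for a set of variable
properties (the `NUM` entry is hypothetical) and is not the actual DMRS-JSON
layout produced by PyDelphin.

```python
import json

# Hypothetical property mapping; NOT the real DMRS-JSON schema.
properties = {"--PSV": "non-apsv", "NUM": "sg"}

# JSON object keys are arbitrary strings, so a leading "--" round-trips.
encoded = json.dumps(properties, sort_keys=True)
decoded = json.loads(encoded)
print(encoded)
```
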
>>>> On Tue, Feb 21, 2017 at 11:35 AM, Michael Wayne Goodman <
>>>> goodmami at uw.edu> wrote:
>>>>
>>>> Hi Megha,
>>>>
>>>> (I've re-CC'd the developers list so they can benefit or contribute;
>>>> please include them in follow-up replies)
>>>>
>>>> Thanks for clarifying.
>>>>
>>>> When you do MRS -> DMRS conversion in your script, it is essentially
>>>> this:
>>>>
>>>>     from delphin.mrs import simplemrs, dmrx
>>>>     print(dmrx.dumps(simplemrs.load(source)))
>>>>
>>>> This loads the simplemrs-encoded source (e.g. a file or sys.stdin; or
>>>> use simplemrs.loads() for a string argument) into the internal *MRS
>>>> representation, then the dmrx codec serializes the internal representation
>>>> to DMRX. Doing DMRS -> MRS conversion is the same, but reversed:
>>>>
>>>>     print(simplemrs.dumps(dmrx.load(source)))
>>>>
>>>> (More technically, the dmrx and simplemrs codecs decode the text
>>>> streams/strings and instantiate the Dmrs() and Mrs() classes, respectively,
>>>> in the delphin.mrs.xmrs module. It is these classes (and not the codecs
>>>> themselves) that do the actual conversion into the internal format.)
>>>>
>>>> However, there is a problem with the GG grammar and DMRS. The variable
>>>> properties prefixed by "--" (e.g. "--PSV") cause errors when loading a
>>>> DMRS. This is because the hyphen is not a valid initial character in an XML
>>>> attribute name (https://www.w3.org/TR/REC-xml/#NT-NameStartChar). It
>>>> is Python's XML parser, and not PyDelphin, that is failing to load the DMRX
>>>> instance. I suggest doing one of the following:
>>>>
>>>>  1. Change the attribute names in the GG grammar
>>>>
>>>>  2. In your conversion script, find and replace these attributes on the
>>>> MRS before converting to DMRS, and change them back in DMRS->MRS
>>>> conversion. You may use underscores (e.g. "__PSV") as the initial
>>>> character, according to the XML spec.
>>>>
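
Option 2 could be sketched roughly as follows, using only the standard
library. The XML fragment is a hypothetical DMRX-like element (real DMRX has
more structure), and `parses`, `escape_props`, and `unescape_props` are
illustrative names, not PyDelphin functions. The regular expressions are a
heuristic: they only touch a "--" or "__" that starts a name followed by
":" or "=".

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical DMRX-like fragment with a "--"-prefixed attribute name.
bad = '<sortinfo cvarsort="e" --PSV="non-apsv"/>'

# A prefix counts only if it begins a name that is followed by ":" or "=".
NAME_AHEAD = r'(?=[A-Za-z_][\w.-]*\s*[:=])'
DASHES = re.compile(r'(?<![\w-])--' + NAME_AHEAD)
UNDERSCORES = re.compile(r'(?<![\w-])__' + NAME_AHEAD)


def parses(xml_text):
    """Return True if the standard XML parser accepts xml_text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def escape_props(text):
    """Rewrite leading '--' on property names to '__' (valid in XML names)."""
    return DASHES.sub("__", text)


def unescape_props(text):
    """Restore the original '--' prefix after converting back."""
    return UNDERSCORES.sub("--", text)


print(parses(bad), parses(escape_props(bad)))
```

The idea is to apply `escape_props` to the text before it reaches the XML
parser and `unescape_props` to the converted output, so the grammar itself
does not need to change.
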
>>>> Does this help?
>>>>
>>>> On Tue, Feb 21, 2017 at 1:13 AM, megha jain <jain11megha at gmail.com>
>>>> wrote:
>>>>
>>>> Hello Michael.
>>>>
>>>> I know how to use PyDelphin, so I can implement this with it.
>>>>
>>>> I want to know which Python code you use to convert a German DMRS back
>>>> into a German MRS, so that this MRS can be given to ACE to generate the
>>>> corresponding German sentence.
>>>>
>>>> I am able to do the first two of these steps; the third is my concern:
>>>>     German sentence => MRS
>>>>     MRS => DMRS
>>>>     DMRS => MRS
>>>>
>>>> EXAMPLE:
>>>>
>>>> (A) Input: Abrams bellte sehr leise.
>>>>
>>>> (B) When I gave the above sentence to ACE, it generated the attached MRS
>>>> (command used: ./ace -g ggp.dat -1Tf input_file.txt).
>>>>
>>>> (C) I gave this MRS as input to the mrs_to_dmrs-pp.py Python script, which
>>>> generated the corresponding DMRS (also attached).
>>>>
>>>> (D) Now I want to convert that DMRS back into an MRS. Which Python code
>>>> should I use for this?
>>>>
>>>>
>>>> I hope this makes my concern clear.
>>>>
>>>> Thank You.
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Michael Wayne Goodman
>>>> Ph.D. Candidate, UW Linguistics
>>>
>>
>>
>> --
>> Michael Wayne Goodman
>> Ph.D. Candidate, UW Linguistics
>>
>
>


-- 
Michael Wayne Goodman
Ph.D. Candidate, UW Linguistics