[developers] STRINGPRED in MRS
Tuấn Anh Lê
tuananh.ke at gmail.com
Sat Jul 29 02:00:42 CEST 2017
Hi Mike, I use it for exactly that reason. Very often I find myself wanting to
edit a DMRS quickly just to see what it looks like after some
minor changes. Also, this is done in a web interface, not a Python console.
XML, JSON, or DMRS objects are quite troublesome for something like that.
In this example I added a weird queue_rel to the end of the sentence.
[image: Inline images 1]
On 29 July 2017 at 05:32, Michael Wayne Goodman <goodmami at uw.edu> wrote:
> Hi Tuan Anh,
>
> I'm glad you found the source of the problem. But can I ask why you are
> using PyDelphin's SimpleDMRS in this way? The documentation (
> https://github.com/delph-in/pydelphin/wiki/delphin.mrs.simpledmrs) states
> that it is only meant as an export format for human consumption as it's
> easier to read than XML or JSON, but it is not intended to be a supported
> DMRS codec.
>
> I ask because I found that Jan Buys had also used this format in his work
> with DMRS. I don't want to promote yet another format variant for our
> software to support, but if people really want to use it in processing I
> should provide a way to read it back into PyDelphin.
>
> On Thu, Jul 27, 2017 at 11:33 PM, Tuấn Anh Lê <tuananh.ke at gmail.com>
> wrote:
>
>> I found the problem. It was not the transformations, but the manual
>> string editing that stole my CARGs. When I edit DMRS manually, I convert
>> DMRSes to strings first, and then convert them back to DMRS objects later.
>> This is what I meant by "strings":
>>
>> dmrs {
>> 10000 [def_explicit_q<0:2> x pers=3 num=sg ind=+];
>> 10001 [poss<0:2> e tense=untensed prog=- perf=- mood=indicative sf=prop];
>> 10002 [pronoun_q<0:2> x pers=1 num=sg pt=std];
>> 10003 [pron<0:2> x pers=1 num=sg pt=std];
>> 10004 [_name_n_of_rel<3:7> x pers=3 num=sg ind=+];
>> 10005 [_be_v_id_rel<8:10> e tense=pres prog=- perf=- mood=indicative sf=prop];
>> 10006 [udef_q<11:20> x pers=3 num=pl ind=+];
>> 10007 [named<11:20>("Abraham") x pers=3 num=pl ind=+];
>> 0:/H -> 10005;
>> 10000:RSTR/H -> 10004;
>> 10001:ARG2/NEQ -> 10003;
>> 10001:ARG1/EQ -> 10004;
>> 10002:RSTR/H -> 10003;
>> 10005:ARG1/NEQ -> 10004;
>> 10005:ARG2/NEQ -> 10007;
>> 10006:RSTR/H -> 10007;
>> }
>>
>> When I converted them back, I forgot that CARGs go in the brackets after
>> cfrom:cto, and not in the attribute list (pers=3 num=pl ind=+).
>>
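A minimal sketch of reading one node line of the string format above, assuming the layout matches the example in this thread: nodeid, predicate, <cfrom:cto>, an optional quoted CARG in parentheses, then the sort and key=value properties. The names NODE_RE and parse_node are hypothetical, and this is not PyDelphin's own reader:

```python
import re

# Hypothetical pattern inferred from the example in this thread only;
# it is not the grammar used by PyDelphin's simpledmrs module.
NODE_RE = re.compile(
    r'^\s*(?P<nodeid>\d+)\s*'                  # e.g. 10007
    r'\[(?P<pred>[^<\s]+)'                     # e.g. named
    r'<(?P<cfrom>-?\d+):(?P<cto>-?\d+)>'       # e.g. <11:20>
    r'(?:\("(?P<carg>[^"]*)"\))?'              # optional ("Abraham")
    r'(?:\s+(?P<sort>\w+))?'                   # optional sort, e.g. x
    r'(?P<props>(?:\s+[\w.-]+=[^\s\]]+)*)'     # e.g. pers=3 num=pl
    r'\s*\];\s*$'
)


def parse_node(line):
    """Parse one node line into a plain dict (CARG is None if absent)."""
    m = NODE_RE.match(line)
    if m is None:
        raise ValueError('not a node line: {!r}'.format(line))
    # Properties are whitespace-separated key=value pairs.
    props = dict(p.split('=', 1) for p in m.group('props').split())
    return {
        'nodeid': int(m.group('nodeid')),
        'pred': m.group('pred'),
        'cfrom': int(m.group('cfrom')),
        'cto': int(m.group('cto')),
        'carg': m.group('carg'),
        'sort': m.group('sort'),
        'props': props,
    }


node = parse_node('10007 [named<11:20>("Abraham") x pers=3 num=pl ind=+];')
print(node['pred'], node['carg'], node['props'])
```

The point of the sketch is that the CARG has to be read from the parenthesized part right after the span; looking for it among the key=value properties will always come up empty.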
>> On 28 July 2017 at 02:10, Michael Wayne Goodman <goodmami at uw.edu> wrote:
>>
>>> Forgive me, Tuan Anh, for bringing the discussion back to the list...
>>>
>>> On Thu, Jul 27, 2017 at 9:25 AM, Tuấn Anh Lê <tuananh.ke at gmail.com>
>>> wrote:
>>>
>>>> Hi Mike, thanks so much. I'm still debugging to see why the CARGs
>>>> disappeared. I'm supporting different formats now, and in the middle of
>>>> converting back and forth something weird happened. I suspect it's my code
>>>> that caused the problem. My parsing flow right now is pyDelphin/ACE => XMRS
>>>> => XML/JSON/string (for manual editing).
>>>>
>>>> If I did lose the CARGs, is it possible to add them back in
>>>> automatically, given that I have this MRS and the sentence text, or do I
>>>> have to write code to do that?
>>>>
>>>
>>> Unless I see your transformations I can't really pinpoint where the CARGs
>>> are being lost. So instead I'll confirm that PyDelphin doesn't drop them
>>> during the conversions you mentioned:
>>>
>>> >>> from delphin.interfaces import ace
>>> >>> from delphin.mrs import xmrs, dmrx
>>> >>> r = ace.parse(
>>> ... '/home/goodmami/grammars/erg-1214/erg-1214.dat',
>>> ... 'My name is Sherlock Holmes.')
>>> NOTE: parsed 1 / 1 sentences, avg 3668k, time 0.12449s
>>> >>> x = r.result(0).mrs() # ACE => XMRS
>>> >>> j = xmrs.Dmrs.to_dict(x) # XMRS => JSON
>>> >>> x2 = xmrs.Dmrs.from_dict(j) # JSON => XMRS
>>> >>> x2.ep(10009).carg
>>> 'Sherlock'
>>> >>> d = dmrx.dumps_one(x) # XMRS => XML
>>> >>> x3 = dmrx.loads_one(d) # XML => XMRS
>>> >>> x3.ep('10009').carg # nodeid became string from conversion
>>> 'Sherlock'
>>>
>>> I'm not sure what you meant by converting to "string". But I think it's
>>> your transformations that are causing the lost CARGs. Are you able to say
>>> what those transformations do?
>>>
>>>> "My name is Sherlock Holmes."
>>>>
>>>> [ TOP: h0
>>>> RELS: < [ def_explicit_q_rel<0:3> LBL: h1 ARG0: x12 [ x NUM: sg IND: + PERS: 3 ] RSTR: h17 ]
>>>> [ poss_rel<0:3> LBL: h2 ARG0: e10 [ e TENSE: untensed MOOD: indicative PERF: - SF: prop PROG: - ] ARG1: x12 ARG2: x11 [ x NUM: sg PERS: 1 ] ]
>>>> [ pronoun_q_rel<0:3> LBL: h3 ARG0: x11 RSTR: h18 ]
>>>> [ pron_rel<0:3> LBL: h4 ARG0: x11 ]
>>>> [ _name_n_of_rel<4:8> LBL: h2 ARG0: x12 ]
>>>> [ _be_v_id_rel<9:11> LBL: h5 ARG0: e13 [ e TENSE: pres MOOD: indicative PERF: - SF: prop PROG: - ] ARG1: x12 ARG2: x16 [ x NUM: sg IND: + PERS: 3 ] ]
>>>> [ proper_q_rel<12:28> LBL: h6 ARG0: x16 RSTR: h19 ]
>>>> [ compound_rel<12:28> LBL: h7 ARG0: e14 [ e TENSE: untensed MOOD: indicative PERF: - SF: prop PROG: - ] ARG1: x16 ARG2: x15 [ x NUM: sg IND: + PERS: 3 ] ]
>>>> [ proper_q_rel<12:20> LBL: h8 ARG0: x15 RSTR: h20 ]
>>>> [ named_rel<12:20> LBL: h9 ARG0: x15 ]
>>>> [ named_rel<21:28> LBL: h7 ARG0: x16 ] >
>>>> HCONS: < h0 qeq h5 h17 qeq h2 h18 qeq h4 h19 qeq h7 h20 qeq h9 > ]
>>>>
>>>>
>>>> On 27 July 2017 at 12:39, Michael Wayne Goodman <goodmami at uw.edu>
>>>> wrote:
>>>>
>>>>> On Wed, Jul 26, 2017 at 6:42 PM, Tuấn Anh Lê <tuananh.ke at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Mike,
>>>>>>
>>>>>> Thanks for answering :D. I think I misunderstood the STRINGPRED flag in
>>>>>> pyDelphin. I always thought that there were 2 types of predicates, namely
>>>>>> GRAMMARPRED and REALPRED. However, I have encountered two different types of
>>>>>> GRAMMARPRED: the one without a CARG (such as pronoun_q_rel) and the one with
>>>>>> a CARG (such as named_rel). I thought STRINGPRED was used to represent the
>>>>>> GRAMMARPRED with a CARG.
>>>>>>
>>>>>
>>>>> Ah I see. Unfortunately there is no predicate subcategorization that
>>>>> indicates the presence of a constant argument. But, fortunately, there is
>>>>> only one argument role for constant arguments (CARG), and nobody that I'm
>>>>> aware of has chosen to rename it or add additional ones. The LKB and PET
>>>>> allowed it to be customized via a *value-feats* parameter, but ACE just
>>>>> assumes it will be "CARG", and nobody's complained about that yet. So I
>>>>> think it's safe to assume it will be called "CARG". This assumption makes
>>>>> the following easier...
>>>>>
>>>>>> Right now I'm transforming DMRS using pyDelphin and I'm looking for a
>>>>>> way to find GRAMMARPREDs with a CARG, because after each transformation
>>>>>> pyDelphin takes my CARGs out and hides them somewhere. I need the CARGs for
>>>>>> other mappings though. Is there a way to accomplish this?
>>>>>>
>>>>>
>>>>> When working with Xmrs structures in PyDelphin, just look for "CARG"
>>>>> on an EP's arguments. E.g. for some Xmrs object x that represents the
>>>>> sentence "Abrams slept." where the named_rel EP has nodeid 10001:
>>>>>
>>>>> >>> x.ep(10001).args.get('CARG')
>>>>> 'Abrams'
>>>>>
>>>>> For convenience, I also define a "carg" property on EPs:
>>>>>
>>>>> >>> x.ep(10001).carg
>>>>> 'Abrams'
>>>>>
>>>>> These are essentially equivalent. If you just want to test for the
>>>>> presence of the CARG, you can do this:
>>>>>
>>>>> >>> 'CARG' in x.ep(10001).args
>>>>> True
>>>>> >>> 'CARG' in x.ep(10000).args # 10000 is for proper_q
>>>>> False
>>>>>
>>>>> In DMRS, nodes do not contain information about the arguments the node
>>>>> participates in; this information is stored in the links. However, CARGs
>>>>> are special in that they become node attributes instead of links:
>>>>>
>>>>> >>> from delphin.mrs.components import nodes
>>>>> >>> ns = {n.nodeid: n for n in nodes(x)}
>>>>> >>> ns[10001].carg
>>>>> 'Abrams'
>>>>>
>>>>> If you try to recreate an Xmrs from Dmrs nodes and don't give it any
>>>>> links, you will only get intrinsic arguments (ARG0s) and CARGs, and any
>>>>> other arguments will be lost:
>>>>>
>>>>> >>> from delphin.mrs.xmrs import Dmrs
>>>>> >>> d = Dmrs(nodes=ns.values())
>>>>> >>> d.ep(10001).args
>>>>> {'CARG': 'Abrams', 'ARG0': 'x5'}
>>>>> >>> d.ep(10002).args # 10002 is for "slept"
>>>>> {'ARG0': 'e6'}
>>>>>
>>>>> If you also give it the links, it should be able to recreate the
>>>>> original (x)mrs.
>>>>>
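To illustrate the nodes-versus-links point quoted above, here is a toy model in plain Python. It assumes nodes are dicts carrying only a variable and an optional CARG, and links are (start, end, role, post) tuples. The function rebuild_args and this data layout are hypothetical, not PyDelphin's API, and scopal (/H) links are skipped because a faithful version would need handles:

```python
def rebuild_args(nodes, links=()):
    """Rebuild per-node argument dicts from DMRS-like nodes and links.

    nodes: {nodeid: {'var': str, 'carg': str or None}}
    links: iterable of (start, end, role, post) tuples
    """
    args = {}
    for nodeid, node in nodes.items():
        # A node alone knows only its intrinsic variable and its CARG.
        node_args = {'ARG0': node['var']}
        if node.get('carg') is not None:
            node_args['CARG'] = node['carg']
        args[nodeid] = node_args
    for start, end, role, post in links:
        # Simplification: handle-valued (/H) arguments are not modelled.
        if post == 'H' or start not in args or end not in nodes:
            continue
        # Every other argument is recoverable only through a link.
        args[start][role] = nodes[end]['var']
    return args


# "Abrams slept.", with node ids and variables as in the quoted example.
nodes = {
    10001: {'var': 'x5', 'carg': 'Abrams'},   # named
    10002: {'var': 'e6', 'carg': None},       # "slept"
}
links = [(10002, 10001, 'ARG1', 'NEQ')]

print(rebuild_args(nodes))          # nodes only: ARG0 and CARG survive
print(rebuild_args(nodes, links))   # with links: ARG1 comes back
```

In this model the CARG survives with or without links, which matches the behaviour described above. So a missing CARG points at the node data itself rather than at the links.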
>>>>> So I'm not sure what kind of transformation you are doing that would
>>>>> cause the CARG to be lost. Can you elaborate or provide a MWE (minimal
>>>>> working example) that shows when the CARG is lost/hidden?
>>>>>
>>>>>
>>>>>>
>>>>>> On 27 July 2017 at 03:45, Michael Wayne Goodman <goodmami at uw.edu>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Tuan Anh,
>>>>>>>
>>>>>>> The thinking about subtypes of predicates has changed over the years,
>>>>>>> and as a result there are numerous overlapping terms for various things. I
>>>>>>> believe that the current consensus (documented on
>>>>>>> http://moin.delph-in.net/PredicateRfc) is that there are two main
>>>>>>> axes of subcategorization for predicate symbols: abstract vs surface and
>>>>>>> string vs type (with "real" predicates being a decomposed form of surface
>>>>>>> preds). But I suspect what you mean by STRINGPRED is not the same thing as
>>>>>>> described on the wiki?
>>>>>>>
>>>>>>> I'm afraid PyDelphin is a bit behind the times WRT these definitions
>>>>>>> of predicates, and instead follows mostly what was described in the MRS DTD
>>>>>>> (as I understood it at the time). Therefore currently PyDelphin calls
>>>>>>> predicates beginning with an underscore a "stringpred", regardless of
>>>>>>> whether the symbol was quoted or not. I.e., it should be called
>>>>>>> "surfacepred" or something. Abstract predicates are called "grammarpred" in
>>>>>>> PyDelphin. I have created a ticket for this bug:
>>>>>>> https://github.com/delph-in/pydelphin/issues/117.
>>>>>>>
>>>>>>> Now back to your question, "named_rel" is called a GRAMMARPRED in
>>>>>>> PyDelphin because the predicate does not start with an underscore.
>>>>>>>
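The rule described above can be sketched in a few lines. This is only an illustration of the underscore test as stated in this message, assuming the quoting of the symbol is ignored. classify_pred is a hypothetical helper, not PyDelphin's implementation, and the returned labels are just the names used in this thread:

```python
def classify_pred(pred):
    """Classify a predicate symbol by the leading-underscore rule.

    Quotes around (or before) the symbol are stripped first, since the
    rule is described as applying whether or not the symbol is quoted.
    """
    symbol = pred.strip().strip('"').lstrip("'")
    # A leading underscore marks a surface predicate ("stringpred" in the
    # terminology of this thread); anything else is an abstract predicate
    # ("grammarpred"), whether or not it takes a CARG.
    return 'stringpred' if symbol.startswith('_') else 'grammarpred'


for pred in ('named_rel', 'pronoun_q_rel', '_have_v_1_rel', '"_the_q_rel"'):
    print(pred, classify_pred(pred))
```

Under this rule named_rel lands in the same class as pronoun_q_rel, so the classification says nothing about whether a CARG is present.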
>>>>>>> On Wed, Jul 26, 2017 at 8:25 AM, Tuấn Anh Lê <tuananh.ke at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi delphinians,
>>>>>>>>
>>>>>>>> I'm working on predicate mapping and found that named_rel is not a
>>>>>>>> STRINGPRED when processed by pyDelphin. Is this expected behaviour? Can
>>>>>>>> someone please shed some light on this matter for me? :D Thank you.
>>>>>>>>
>>>>>>>> REALPRED 1
>>>>>>>> STRINGPRED 2
>>>>>>>> <ElementaryPredication object (pron_rel (x24)) at 140241381023560> 0
>>>>>>>> <ElementaryPredication object (pronoun_q_rel (x24)) at 140241380610120> 0
>>>>>>>> <ElementaryPredication object (_have_v_1_rel (e25)) at 140241380610240> 1
>>>>>>>> <ElementaryPredication object (_the_q_rel (x26)) at 140241380610360> 1
>>>>>>>> <ElementaryPredication object (_pleasure_n_of_rel (x26)) at 140241380610480> 1
>>>>>>>> <ElementaryPredication object (udef_q_rel (x27)) at 140241380610600> 0
>>>>>>>> <ElementaryPredication object (nominalization_rel (x27)) at 140241380610720> 0
>>>>>>>> <ElementaryPredication object (_make_v_1_rel (e28)) at 140241380610840> 1
>>>>>>>> <ElementaryPredication object (_the_q_rel (x29)) at 140241380610960> 1
>>>>>>>> <ElementaryPredication object (_doctor_n_1_rel (x29)) at 140241380611080> 1
>>>>>>>> <ElementaryPredication object (def_explicit_q_rel (x31)) at 140241380611200> 0
>>>>>>>> <ElementaryPredication object (poss_rel (e30)) at 140241380611320> 0
>>>>>>>> <ElementaryPredication object (_acquaintance_n_1_rel (x31)) at 140241380611440> 1
>>>>>>>> <ElementaryPredication object (_say_v_to_rel (e32)) at 140241380611560> 1
>>>>>>>> <ElementaryPredication object (proper_q_rel (x33)) at 140241380611680> 0
>>>>>>>> *<ElementaryPredication object (named_rel (x33)) at 140241380611800> 0*
>>>>>>>> <ElementaryPredication object (_and_c_rel (e34)) at 140241380611920> 1
>>>>>>>> <ElementaryPredication object (_in_p_rel (e35)) at 140241380612040> 1
>>>>>>>> <ElementaryPredication object (udef_q_rel (x37)) at 140241380612160> 0
>>>>>>>> <ElementaryPredication object (_a+few_a_1_rel (e36)) at 140241380612280> 1
>>>>>>>> <ElementaryPredication object (_word_n_of_rel (x37)) at 140241380612400> 1
>>>>>>>> <ElementaryPredication object (pron_rel (x38)) at 140241380612520> 0
>>>>>>>> <ElementaryPredication object (pronoun_q_rel (x38)) at 140241380612640> 0
>>>>>>>> <ElementaryPredication object (_sketch_v_1_rel (e39)) at 140241380612760> 1
>>>>>>>> <ElementaryPredication object (_out_p_rel (e40)) at 140241380612880> 1
>>>>>>>> <ElementaryPredication object (free_relative_q_rel (x41)) at 140241380613000> 0
>>>>>>>> <ElementaryPredication object (thing_rel (x41)) at 140241380613120> 0
>>>>>>>> <ElementaryPredication object (_occur_v_to_rel (e42)) at 140241380613240> 1
>>>>>>>>
>>>>>>>>
>>>>>>>> Yours,
>>>>>>>> --
>>>>>>>> Tuan Anh Le
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Michael Wayne Goodman
>>>>>>> Ph.D. Candidate, UW Linguistics
--
Yours,
--
Tuan Anh Le
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screenshot from 2017-07-29 07-58-32.png
Type: image/png
Size: 82061 bytes
Desc: not available
URL: <http://lists.delph-in.net/archives/developers/attachments/20170729/91a009e2/attachment-0001.png>