[developers] STRINGPRED in MRS
Tuấn Anh Lê
tuananh.ke at gmail.com
Fri Jul 28 08:33:23 CEST 2017
I found the problem. It was not the transformations but the manual string
editing that stole my CARGs. When I edit a DMRS manually, I first convert it
to a string and later convert the string back into a DMRS object. This is
what I meant by "strings":
dmrs {
10000 [def_explicit_q<0:2> x pers=3 num=sg ind=+];
10001 [poss<0:2> e tense=untensed prog=- perf=- mood=indicative sf=prop];
10002 [pronoun_q<0:2> x pers=1 num=sg pt=std];
10003 [pron<0:2> x pers=1 num=sg pt=std];
10004 [_name_n_of_rel<3:7> x pers=3 num=sg ind=+];
10005 [_be_v_id_rel<8:10> e tense=pres prog=- perf=- mood=indicative sf=prop];
10006 [udef_q<11:20> x pers=3 num=pl ind=+];
10007 [named<11:20>("Abraham") x pers=3 num=pl ind=+];
0:/H -> 10005;
10000:RSTR/H -> 10004;
10001:ARG2/NEQ -> 10003;
10001:ARG1/EQ -> 10004;
10002:RSTR/H -> 10003;
10005:ARG1/NEQ -> 10004;
10005:ARG2/NEQ -> 10007;
10006:RSTR/H -> 10007;
}
When I converted them back, I forgot that CARGs go in the parentheses right
after <cfrom:cto>, not in the attribute list (pers=3 num=pl ind=+).
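
To make the mistake concrete, here is a minimal sketch of reading one node
line of the string format above while keeping the CARG. The regular
expression and the parse_node helper are my own illustration, not part of
pyDelphin, and they assume every node line looks like the ones shown above:

```python
import re

# Hypothetical pattern for one node line of the string format shown above:
#   <nodeid> [<pred><cfrom:cto>("<carg>") <sort> <key=value> ...];
# The ("...") group is optional and holds the CARG.
NODE_RE = re.compile(
    r'^\s*(?P<nodeid>\d+)\s*\['
    r'(?P<pred>[^<\s]+)<(?P<cfrom>-?\d+):(?P<cto>-?\d+)>'
    r'(?:\("(?P<carg>[^"]*)"\))?'
    r'\s*(?P<sortinfo>[^\]]*)\];\s*$'
)


def parse_node(line):
    """Parse one node line into a plain dict (illustrative helper only)."""
    m = NODE_RE.match(line)
    if m is None:
        raise ValueError('not a node line: %r' % line)
    tokens = m.group('sortinfo').split()
    return {
        'nodeid': int(m.group('nodeid')),
        'pred': m.group('pred'),
        'cfrom': int(m.group('cfrom')),
        'cto': int(m.group('cto')),
        'carg': m.group('carg'),  # None when the node has no CARG
        'sort': tokens[0] if tokens else None,
        'properties': dict(t.split('=', 1) for t in tokens[1:]),
    }


node = parse_node('10007 [named<11:20>("Abraham") x pers=3 num=pl ind=+];')
print(node['carg'])  # Abraham
```

The point is only that the CARG has to be read from its own slot before the
sort and the key=value pairs; looking for it among the properties finds
nothing.
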
On 28 July 2017 at 02:10, Michael Wayne Goodman <goodmami at uw.edu> wrote:
> Forgive me, Tuan Anh, for bringing the discussion back to the list...
>
> On Thu, Jul 27, 2017 at 9:25 AM, Tuấn Anh Lê <tuananh.ke at gmail.com> wrote:
>
>> Hi Mike, thanks so much. I'm still debugging to see why the CARGs
>> disappeared. I'm supporting different formats now, and in the middle of
>> converting back and forth something weird happened. I suspect it's my code
>> that caused the problem. My parsing flow right now is pyDelphin/ACE => XMRS
>> => XML/JSON/string (for manual editing).
>>
>> If I did lose the CARGs, is it possible to add them back in automatically,
>> given that I have this MRS and the sentence text, or do I have to write
>> code to do that?
>>
>
> Unless I see your transformations I can't really pinpoint where the CARGs
> are being lost. So instead I'll confirm that PyDelphin doesn't drop them
> during the conversions you mentioned:
>
> >>> from delphin.interfaces import ace
> >>> from delphin.mrs import xmrs, dmrx
> >>> r = ace.parse(
> ... '/home/goodmami/grammars/erg-1214/erg-1214.dat',
> ... 'My name is Sherlock Holmes.')
> NOTE: parsed 1 / 1 sentences, avg 3668k, time 0.12449s
> >>> x = r.result(0).mrs() # ACE => XMRS
> >>> j = xmrs.Dmrs.to_dict(x) # XMRS => JSON
> >>> x2 = xmrs.Dmrs.from_dict(j) # JSON => XMRS
> >>> x2.ep(10009).carg
> 'Sherlock'
> >>> d = dmrx.dumps_one(x) # XMRS => XML
> >>> x3 = dmrx.loads_one(d) # XML => XMRS
> >>> x3.ep('10009').carg # nodeid became string from conversion
> 'Sherlock'
>
> I'm not sure what you meant by converting to "string". But I think it's
> your transformations that are causing the lost CARGs. Are you able to say
> what those transformations do?
>
> "My name is Sherlock Holmes."
>>
>> [ TOP: h0
>> RELS: < [ def_explicit_q_rel<0:3> LBL: h1 ARG0: x12 [ x NUM: sg IND: + PERS: 3 ] RSTR: h17 ]
>> [ poss_rel<0:3> LBL: h2 ARG0: e10 [ e TENSE: untensed MOOD: indicative PERF: - SF: prop PROG: - ] ARG1: x12 ARG2: x11 [ x NUM: sg PERS: 1 ] ]
>> [ pronoun_q_rel<0:3> LBL: h3 ARG0: x11 RSTR: h18 ]
>> [ pron_rel<0:3> LBL: h4 ARG0: x11 ]
>> [ _name_n_of_rel<4:8> LBL: h2 ARG0: x12 ]
>> [ _be_v_id_rel<9:11> LBL: h5 ARG0: e13 [ e TENSE: pres MOOD: indicative PERF: - SF: prop PROG: - ] ARG1: x12 ARG2: x16 [ x NUM: sg IND: + PERS: 3 ] ]
>> [ proper_q_rel<12:28> LBL: h6 ARG0: x16 RSTR: h19 ]
>> [ compound_rel<12:28> LBL: h7 ARG0: e14 [ e TENSE: untensed MOOD: indicative PERF: - SF: prop PROG: - ] ARG1: x16 ARG2: x15 [ x NUM: sg IND: + PERS: 3 ] ]
>> [ proper_q_rel<12:20> LBL: h8 ARG0: x15 RSTR: h20 ] [ named_rel<12:20> LBL: h9 ARG0: x15 ]
>> [ named_rel<21:28> LBL: h7 ARG0: x16 ] >
>> HCONS: < h0 qeq h5 h17 qeq h2 h18 qeq h4 h19 qeq h7 h20 qeq h9 > ]
>>
>>
>> On 27 July 2017 at 12:39, Michael Wayne Goodman <goodmami at uw.edu> wrote:
>>
>>> On Wed, Jul 26, 2017 at 6:42 PM, Tuấn Anh Lê <tuananh.ke at gmail.com>
>>> wrote:
>>>
>>>> Hi Mike,
>>>>
>>>> Thanks for answering :D. I think I misunderstood the STRINGPRED flag in
>>>> pyDelphin. I always thought that there were two types of predicates,
>>>> namely GRAMMARPRED and REALPRED. However, I have encountered two different
>>>> kinds of GRAMMARPRED: those without a CARG (such as pronoun_q_rel) and
>>>> those with a CARG (such as named_rel). I thought STRINGPRED was used to
>>>> represent a GRAMMARPRED with a CARG.
>>>>
>>>
>>> Ah I see. Unfortunately there is no predicate subcategorization that
>>> indicates the presence of a constant argument. But, fortunately, there is
>>> only one argument role for constant arguments (CARG), and nobody that I'm
>>> aware of has chosen to rename it or add additional ones. The LKB and PET
>>> allowed it to be customized via a *value-feats* parameter, but ACE just
>>> assumes it will be "CARG", and nobody's complained about that yet. So I
>>> think it's safe to assume it will be called "CARG". This assumption makes
>>> the following easier...
>>>
>>>> Right now I'm transforming DMRS using pyDelphin and I'm looking for a
>>>> way to find a GRAMMARPRED with a CARG, because after each transformation
>>>> pyDelphin takes my CARGs out and hides them somewhere. I need the CARGs
>>>> for other mappings, though. Is there a way to accomplish this?
>>>>
>>>
>>> When working with Xmrs structures in PyDelphin, just look for "CARG" on
>>> an EP's arguments. E.g. for some Xmrs object x that represents the sentence
>>> "Abrams slept." where the named_rel EP has nodeid 10001:
>>>
>>> >>> x.ep(10001).args.get('CARG')
>>> 'Abrams'
>>>
>>> For convenience, I also define a "carg" property on EPs:
>>>
>>> >>> x.ep(10001).carg
>>> 'Abrams'
>>>
>>> These are essentially equivalent. If you just want to test for the
>>> presence of the CARG, you can do this:
>>>
>>> >>> 'CARG' in x.ep(10001).args
>>> True
>>> >>> 'CARG' in x.ep(10000).args # 10000 is for proper_q
>>> False
>>>
>>> In DMRS, nodes do not contain information about the arguments the node
>>> participates in; this information is stored in the links. However, CARGs
>>> are special in that they become node attributes instead of links:
>>>
>>> >>> from delphin.mrs.components import nodes
>>> >>> ns = {n.nodeid: n for n in nodes(x)}
>>> >>> ns[10001].carg
>>> 'Abrams'
>>>
>>> If you try to recreate an Xmrs from Dmrs nodes and don't give it any
>>> links, you will only get intrinsic arguments (ARG0s) and CARGs, and any
>>> other arguments will be lost:
>>>
>>> >>> from delphin.mrs.xmrs import Dmrs
>>> >>> d = Dmrs(nodes=ns.values())
>>> >>> d.ep(10001).args
>>> {'CARG': 'Abrams', 'ARG0': 'x5'}
>>> >>> d.ep(10002).args # 10002 is for "slept"
>>> {'ARG0': 'e6'}
>>>
>>> If you also give it the links, it should be able to recreate the
>>> original (x)mrs.
>>>
>>> So I'm not sure what kind of transformation you are doing that would
>>> cause the CARG to be lost. Can you elaborate or provide an MWE (minimal
>>> working example) that shows when the CARG is lost/hidden?
>>>
>>>
>>>>
>>>> On 27 July 2017 at 03:45, Michael Wayne Goodman <goodmami at uw.edu>
>>>> wrote:
>>>>
>>>>> Hi Tuan Anh,
>>>>>
>>>>> The thinking about subtypes of predicates has changed over the years, and
>>>>> as a result there are numerous overlapping terms for various things. I
>>>>> believe that the current consensus (documented on
>>>>> http://moin.delph-in.net/PredicateRfc) is that there are two main
>>>>> axes of subcategorization for predicate symbols: abstract vs surface and
>>>>> string vs type (with "real" predicates being a decomposed form of surface
>>>>> preds). But I suspect what you mean by STRINGPRED is not the same thing as
>>>>> described on the wiki?
>>>>>
>>>>> I'm afraid PyDelphin is a bit behind the times WRT these definitions
>>>>> of predicates, and instead follows mostly what was described in the MRS DTD
>>>>> (as I understood it at the time). Therefore, PyDelphin currently calls
>>>>> any predicate beginning with an underscore a "stringpred", regardless of
>>>>> whether the symbol was quoted or not. I.e., it should be called
>>>>> "surfacepred" or something. Abstract predicates are called "grammarpred" in
>>>>> PyDelphin. I have created a ticket for this bug:
>>>>> https://github.com/delph-in/pydelphin/issues/117.
>>>>>
>>>>> Now back to your question, "named_rel" is called a GRAMMARPRED in
>>>>> PyDelphin because the predicate does not start with an underscore.
>>>>>
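
The underscore rule described above can be sketched in a few lines of plain
Python. This is only an illustration of the rule as stated in this thread, not
PyDelphin's actual implementation; the function name classify_pred and the
labels 'surface' and 'abstract' are assumptions chosen to match the wiki
terminology:

```python
def classify_pred(pred):
    """Classify a predicate symbol by its leading underscore only.

    Quoting is ignored, as described above: a quoted and an unquoted
    symbol with the same name get the same label.
    """
    symbol = pred.strip('"').lstrip("'")
    return 'surface' if symbol.startswith('_') else 'abstract'


for pred in ('named_rel', 'pronoun_q_rel', '_have_v_1_rel', '"_the_q_rel"'):
    print(pred, classify_pred(pred))
```

Under this rule named_rel comes out as 'abstract' simply because it has no
leading underscore; whether an EP carries a CARG plays no part in the label
and has to be checked separately.
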
>>>>> On Wed, Jul 26, 2017 at 8:25 AM, Tuấn Anh Lê <tuananh.ke at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi delphinians,
>>>>>>
>>>>>> I'm working on predicate mapping and found that named_rel predicates
>>>>>> are not STRINGPRED when processed by pyDelphin. Is this the expected
>>>>>> behaviour? Can someone please shed some light on this for me? :D Thank you.
>>>>>>
>>>>>> REALPRED 1
>>>>>> STRINGPRED 2
>>>>>> <ElementaryPredication object (pron_rel (x24)) at 140241381023560> 0
>>>>>> <ElementaryPredication object (pronoun_q_rel (x24)) at 140241380610120> 0
>>>>>> <ElementaryPredication object (_have_v_1_rel (e25)) at 140241380610240> 1
>>>>>> <ElementaryPredication object (_the_q_rel (x26)) at 140241380610360> 1
>>>>>> <ElementaryPredication object (_pleasure_n_of_rel (x26)) at 140241380610480> 1
>>>>>> <ElementaryPredication object (udef_q_rel (x27)) at 140241380610600> 0
>>>>>> <ElementaryPredication object (nominalization_rel (x27)) at 140241380610720> 0
>>>>>> <ElementaryPredication object (_make_v_1_rel (e28)) at 140241380610840> 1
>>>>>> <ElementaryPredication object (_the_q_rel (x29)) at 140241380610960> 1
>>>>>> <ElementaryPredication object (_doctor_n_1_rel (x29)) at 140241380611080> 1
>>>>>> <ElementaryPredication object (def_explicit_q_rel (x31)) at 140241380611200> 0
>>>>>> <ElementaryPredication object (poss_rel (e30)) at 140241380611320> 0
>>>>>> <ElementaryPredication object (_acquaintance_n_1_rel (x31)) at 140241380611440> 1
>>>>>> <ElementaryPredication object (_say_v_to_rel (e32)) at 140241380611560> 1
>>>>>> <ElementaryPredication object (proper_q_rel (x33)) at 140241380611680> 0
>>>>>> *<ElementaryPredication object (named_rel (x33)) at 140241380611800> 0*
>>>>>> <ElementaryPredication object (_and_c_rel (e34)) at 140241380611920> 1
>>>>>> <ElementaryPredication object (_in_p_rel (e35)) at 140241380612040> 1
>>>>>> <ElementaryPredication object (udef_q_rel (x37)) at 140241380612160> 0
>>>>>> <ElementaryPredication object (_a+few_a_1_rel (e36)) at 140241380612280> 1
>>>>>> <ElementaryPredication object (_word_n_of_rel (x37)) at 140241380612400> 1
>>>>>> <ElementaryPredication object (pron_rel (x38)) at 140241380612520> 0
>>>>>> <ElementaryPredication object (pronoun_q_rel (x38)) at 140241380612640> 0
>>>>>> <ElementaryPredication object (_sketch_v_1_rel (e39)) at 140241380612760> 1
>>>>>> <ElementaryPredication object (_out_p_rel (e40)) at 140241380612880> 1
>>>>>> <ElementaryPredication object (free_relative_q_rel (x41)) at 140241380613000> 0
>>>>>> <ElementaryPredication object (thing_rel (x41)) at 140241380613120> 0
>>>>>> <ElementaryPredication object (_occur_v_to_rel (e42)) at 140241380613240> 1
>>>>>>
>>>>>>
>>>>>> Yours,
>>>>>> --
>>>>>> Tuan Anh Le
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Michael Wayne Goodman
>>>>> Ph.D. Candidate, UW Linguistics
--
Yours,
--
Tuan Anh Le