[developers] STRINGPRED in MRS

Michael Wayne Goodman goodmami at uw.edu
Thu Jul 27 20:10:46 CEST 2017


Forgive me, Tuan Anh, for bringing the discussion back to the list...

On Thu, Jul 27, 2017 at 9:25 AM, Tuấn Anh Lê <tuananh.ke at gmail.com> wrote:

> Hi Mike, thanks so much. I'm still debugging to see why the CARGS
> disappeared. I'm supporting different formats now and in the middle of
> converting back and forth something weird happened. I suspect it's my code
> that caused the problem. My parsing flow right now is pyDelphin/ACE => XMRS
> => XML/JSON/string (for manual editing).
>
> If I did lose the CARGS, is it possible to add them back in automatically,
> given that I have this MRS and the sentence text, or do I have to write
> code to do that?
>

Without seeing your transformations I can't pinpoint where the CARGs are
being lost. So instead I'll confirm that PyDelphin doesn't drop them during
the conversions you mentioned:

    >>> from delphin.interfaces import ace
    >>> from delphin.mrs import xmrs, dmrx
    >>> r = ace.parse(
    ...   '/home/goodmami/grammars/erg-1214/erg-1214.dat',
    ...   'My name is Sherlock Holmes.')
    NOTE: parsed 1 / 1 sentences, avg 3668k, time 0.12449s
    >>> x = r.result(0).mrs()     # ACE => XMRS
    >>> j = xmrs.Dmrs.to_dict(x)  # XMRS => JSON
    >>> x2 = xmrs.Dmrs.from_dict(j)  # JSON => XMRS
    >>> x2.ep(10009).carg
    'Sherlock'
    >>> d = dmrx.dumps_one(x)     # XMRS => XML
    >>> x3 = dmrx.loads_one(d)    # XML => XMRS
    >>> x3.ep('10009').carg       # nodeid became string from conversion
    'Sherlock'

I'm not sure what you meant by converting to "string", but I suspect it's
your transformations that are dropping the CARGs. Can you say what those
transformations do?
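
One way to narrow this down is to check the CARGs after every step of your
pipeline. Below is a minimal sketch of that idea. It does not use PyDelphin
at all: the nodes are plain dicts, and `collect_cargs`, `find_lossy_step`,
and the two example steps are hypothetical names and made-up data for
illustration, so you would substitute your own structures and
transformations. Nodeids are compared as strings because, as noted above,
some conversions turn them into strings.

```python
import json


def collect_cargs(nodes):
    """Map str(nodeid) -> CARG for every node that has a CARG."""
    return {
        str(n["nodeid"]): n["carg"]
        for n in nodes
        if n.get("carg") is not None
    }


def find_lossy_step(nodes, steps):
    """Apply each (name, function) step in order.

    Return the name of the first step after which a CARG from the input
    is missing or changed, or None if every CARG survives.
    """
    expected = collect_cargs(nodes)
    for name, step in steps:
        nodes = step(nodes)
        found = collect_cargs(nodes)
        if any(found.get(nid) != carg for nid, carg in expected.items()):
            return name
    return None


def json_roundtrip(nodes):
    # Hypothetical step: serialize to JSON and back.
    return json.loads(json.dumps(nodes))


def buggy_edit(nodes):
    # Hypothetical step: copies a fixed set of keys and forgets "carg".
    return [{"nodeid": n["nodeid"], "pred": n["pred"]} for n in nodes]


# Made-up sample nodes, not real PyDelphin objects.
nodes = [
    {"nodeid": 10008, "pred": "proper_q_rel"},
    {"nodeid": 10009, "pred": "named_rel", "carg": "Sherlock"},
]
steps = [("json", json_roundtrip), ("edit", buggy_edit)]
print(find_lossy_step(nodes, steps))  # prints: edit
```

Running something like this over your real steps should at least tell you
which conversion to look at first.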

> "My name is Sherlock Holmes.
>
> [ TOP: h0
>   RELS: < [ def_explicit_q_rel<0:3> LBL: h1 ARG0: x12 [ x NUM: sg IND: + PERS: 3 ] RSTR: h17 ]
>           [ poss_rel<0:3> LBL: h2 ARG0: e10 [ e TENSE: untensed MOOD: indicative PERF: - SF: prop PROG: - ] ARG1: x12 ARG2: x11 [ x NUM: sg PERS: 1 ] ]
>           [ pronoun_q_rel<0:3> LBL: h3 ARG0: x11 RSTR: h18 ]
>           [ pron_rel<0:3> LBL: h4 ARG0: x11 ]
>           [ _name_n_of_rel<4:8> LBL: h2 ARG0: x12 ]
>           [ _be_v_id_rel<9:11> LBL: h5 ARG0: e13 [ e TENSE: pres MOOD: indicative PERF: - SF: prop PROG: - ] ARG1: x12 ARG2: x16 [ x NUM: sg IND: + PERS: 3 ] ]
>           [ proper_q_rel<12:28> LBL: h6 ARG0: x16 RSTR: h19 ]
>           [ compound_rel<12:28> LBL: h7 ARG0: e14 [ e TENSE: untensed MOOD: indicative PERF: - SF: prop PROG: - ] ARG1: x16 ARG2: x15 [ x NUM: sg IND: + PERS: 3 ] ]
>           [ proper_q_rel<12:20> LBL: h8 ARG0: x15 RSTR: h20 ]
>           [ named_rel<12:20> LBL: h9 ARG0: x15 ]
>           [ named_rel<21:28> LBL: h7 ARG0: x16 ] >
>   HCONS: < h0 qeq h5 h17 qeq h2 h18 qeq h4 h19 qeq h7 h20 qeq h9 > ]
>
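
On adding them back: the MRS above still has the character spans on each
EP, so a rough fallback is to slice the sentence text with them. This is
only a sketch under assumptions: `guess_cargs` and `NAMED_SPAN` are
hypothetical names, it only looks at named_rel, and it assumes the sentence
text begins with the opening quotation mark, which is what the spans
suggest. The surface string may not always be the CARG the grammar would
assign, so re-parsing is the safer route when it is available.

```python
import re
import string

# Matches e.g. "named_rel<12:20>" and captures the character span.
NAMED_SPAN = re.compile(r"\bnamed_rel<(\d+):(\d+)>")


def guess_cargs(text, mrs_string):
    """Guess a CARG for each named_rel from its <from:to> character span."""
    guesses = []
    for match in NAMED_SPAN.finditer(mrs_string):
        start, end = int(match.group(1)), int(match.group(2))
        # Spans can include adjacent punctuation (e.g. "Holmes."), so trim it.
        surface = text[start:end].strip(string.punctuation + string.whitespace)
        guesses.append(((start, end), surface))
    return guesses


text = '"My name is Sherlock Holmes.'
mrs_string = (
    "[ proper_q_rel<12:20> LBL: h8 ARG0: x15 RSTR: h20 ] "
    "[ named_rel<12:20> LBL: h9 ARG0: x15 ] "
    "[ named_rel<21:28> LBL: h7 ARG0: x16 ]"
)
print(guess_cargs(text, mrs_string))
# prints: [((12, 20), 'Sherlock'), ((21, 28), 'Holmes')]
```
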
>
> On 27 July 2017 at 12:39, Michael Wayne Goodman <goodmami at uw.edu> wrote:
>
>> On Wed, Jul 26, 2017 at 6:42 PM, Tuấn Anh Lê <tuananh.ke at gmail.com>
>> wrote:
>>
>>> Hi Mike,
>>>
>>> Thanks for answering :D. I think I misunderstood the STRINGPRED flag in
>>> pyDelphin. I always thought that there were 2 types of predicates, namely
>>> GRAMMARPRED and REALPRED. However, I have encountered two different types
>>> of GRAMMARPRED: the ones without a CARG (such as pronoun_q_rel) and the
>>> ones with a CARG (such as named_rel). I thought STRINGPRED was used to
>>> represent the GRAMMARPRED with CARG.
>>>
>>
>> Ah I see. Unfortunately there is no predicate subcategorization that
>> indicates the presence of a constant argument. But, fortunately, there is
>> only one argument role for constant arguments (CARG), and nobody that I'm
>> aware of has chosen to rename it or add additional ones. The LKB and PET
>> allowed it to be customized via a *value-feats* parameter, but ACE just
>> assumes it will be "CARG", and nobody's complained about that yet. So I
>> think it's safe to assume it will be called "CARG". This assumption makes
>> the following easier...
>>
>>> Right now I'm transforming DMRS using pyDelphin and I'm looking for a way
>>> to find GRAMMARPRED with CARG because after each transformation pyDelphin
>>> takes my CARGS out and hides them somewhere. I need the CARGS for other
>>> mapping though. Is there a way to accomplish this?
>>>
>>
>> When working with Xmrs structures in PyDelphin, just look for "CARG" on
>> an EP's arguments. E.g. for some Xmrs object x that represents the sentence
>> "Abrams slept." where the named_rel EP has nodeid 10001:
>>
>> >>> x.ep(10001).args.get('CARG')
>> 'Abrams'
>>
>> For convenience, I also define a "carg" property on EPs:
>>
>> >>> x.ep(10001).carg
>> 'Abrams'
>>
>> These are essentially equivalent. If you just want to test for the
>> presence of the CARG, you can do this:
>>
>> >>> 'CARG' in x.ep(10001).args
>> True
>> >>> 'CARG' in x.ep(10000).args  # 10000 is for proper_q
>> False
>>
>> In DMRS, nodes do not contain information about the arguments the node
>> participates in; this information is stored in the links. However, CARGs
>> are special in that they become node attributes instead of links:
>>
>> >>> from delphin.mrs.components import nodes
>> >>> ns = {n.nodeid: n for n in nodes(x)}
>> >>> ns[10001].carg
>> 'Abrams'
>>
>> If you try to recreate an Xmrs from Dmrs nodes and don't give it any
>> links, you will only get intrinsic arguments (ARG0s) and CARGs, and any
>> other arguments will be lost:
>>
>> >>> from delphin.mrs.xmrs import Dmrs
>> >>> d = Dmrs(nodes=ns.values())
>> >>> d.ep(10001).args
>> {'CARG': 'Abrams', 'ARG0': 'x5'}
>> >>> d.ep(10002).args  # 10002 is for "slept"
>> {'ARG0': 'e6'}
>>
>> If you also give it the links, it should be able to recreate the original
>> (x)mrs.
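
To illustrate the node/link split described above without PyDelphin, here
is a toy sketch with plain dicts. Everything in it is made up for
illustration: `args_from_dmrs`, the dict keys, the sample data, and the
shortcut of storing a variable directly on each node. Real DMRS links also
carry a post-slash label (such as EQ, NEQ, or H), which this ignores.

```python
def args_from_dmrs(nodes, links=()):
    """Rebuild per-node argument dicts from DMRS-like nodes and links."""
    args = {}
    for node in nodes:
        # Every node contributes its intrinsic argument...
        node_args = {"ARG0": node["var"]}
        # ...and its CARG, which lives on the node rather than on a link.
        if node.get("carg") is not None:
            node_args["CARG"] = node["carg"]
        args[node["nodeid"]] = node_args
    var_of = {node["nodeid"]: node["var"] for node in nodes}
    # All other arguments can only be recovered from the links.
    for link in links:
        args[link["start"]][link["role"]] = var_of[link["end"]]
    return args


# Toy data for "Abrams slept."
nodes = [
    {"nodeid": 10001, "pred": "named_rel", "var": "x5", "carg": "Abrams"},
    {"nodeid": 10002, "pred": "_sleep_v_1_rel", "var": "e6"},
]
links = [{"start": 10002, "end": 10001, "role": "ARG1"}]

print(args_from_dmrs(nodes))         # ARG1 of "slept" is missing
print(args_from_dmrs(nodes, links))  # ARG1 of "slept" is restored
```
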
>>
>> So I'm not sure what kind of transformation would cause the CARG to be
>> lost. Can you elaborate or provide a MWE (minimal working example) that
>> shows when the CARG is lost/hidden?
>>
>>
>>>
>>> On 27 July 2017 at 03:45, Michael Wayne Goodman <goodmami at uw.edu> wrote:
>>>
>>>> Hi Tuan Anh,
>>>>
>>>> The thinking about subtypes of predicates has changed over the years, and
>>>> as a result there are numerous overlapping terms for various things. I
>>>> believe that the current consensus (documented on
>>>> http://moin.delph-in.net/PredicateRfc) is that there are two main axes
>>>> of subcategorization for predicate symbols: abstract vs surface and string
>>>> vs type (with "real" predicates being a decomposed form of surface preds).
>>>> But I suspect what you mean by STRINGPRED is not the same thing as
>>>> described on the wiki?
>>>>
>>>> I'm afraid PyDelphin is a bit behind the times WRT these definitions of
>>>> predicates, and instead follows mostly what was described in the MRS DTD
>>>> (as I understood it at the time). Therefore currently PyDelphin calls
>>>> predicates beginning with an underscore a "stringpred", regardless of
>>>> whether the symbol was quoted or not. I.e., it should be called
>>>> "surfacepred" or something. Abstract predicates are called "grammarpred" in
>>>> PyDelphin. I have created a ticket for this bug:
>>>> https://github.com/delph-in/pydelphin/issues/117.
>>>>
>>>> Now back to your question, "named_rel" is called a GRAMMARPRED in
>>>> PyDelphin because the predicate does not start with an underscore.
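
To make the underscore rule concrete, here is a sketch of it. This is not
PyDelphin's code: `pred_kind` is a made-up name, and the labels follow the
wiki's surface/abstract terms rather than PyDelphin's STRINGPRED and
GRAMMARPRED constants. Any quoting around the symbol is ignored, since the
rule only looks at the leading underscore.

```python
def pred_kind(pred):
    """Classify a predicate symbol by its leading underscore only."""
    # Drop surrounding quote characters so quoting does not affect the result.
    symbol = pred.strip("\"'")
    return "surface" if symbol.startswith("_") else "abstract"


for pred in ("named_rel", "pronoun_q_rel", "_have_v_1_rel", '"_the_q_rel"'):
    print(pred, pred_kind(pred))
```

Under this rule named_rel comes out as abstract whether or not it carries
a CARG, which matches the output in the original question.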
>>>>
>>>> On Wed, Jul 26, 2017 at 8:25 AM, Tuấn Anh Lê <tuananh.ke at gmail.com>
>>>> wrote:
>>>>
>>>>> Hi delphinians,
>>>>>
>>>>> I'm working on predicate mapping and found that named_rel predicates are
>>>>> not STRINGPRED when processed by pyDelphin. Is this expected behaviour? Can
>>>>> someone please shed some light on this matter for me :D Thank you.
>>>>>
>>>>> REALPRED 1
>>>>> STRINGPRED 2
>>>>> <ElementaryPredication object (pron_rel (x24)) at 140241381023560> 0
>>>>> <ElementaryPredication object (pronoun_q_rel (x24)) at 140241380610120> 0
>>>>> <ElementaryPredication object (_have_v_1_rel (e25)) at 140241380610240> 1
>>>>> <ElementaryPredication object (_the_q_rel (x26)) at 140241380610360> 1
>>>>> <ElementaryPredication object (_pleasure_n_of_rel (x26)) at 140241380610480> 1
>>>>> <ElementaryPredication object (udef_q_rel (x27)) at 140241380610600> 0
>>>>> <ElementaryPredication object (nominalization_rel (x27)) at 140241380610720> 0
>>>>> <ElementaryPredication object (_make_v_1_rel (e28)) at 140241380610840> 1
>>>>> <ElementaryPredication object (_the_q_rel (x29)) at 140241380610960> 1
>>>>> <ElementaryPredication object (_doctor_n_1_rel (x29)) at 140241380611080> 1
>>>>> <ElementaryPredication object (def_explicit_q_rel (x31)) at 140241380611200> 0
>>>>> <ElementaryPredication object (poss_rel (e30)) at 140241380611320> 0
>>>>> <ElementaryPredication object (_acquaintance_n_1_rel (x31)) at 140241380611440> 1
>>>>> <ElementaryPredication object (_say_v_to_rel (e32)) at 140241380611560> 1
>>>>> <ElementaryPredication object (proper_q_rel (x33)) at 140241380611680> 0
>>>>> *<ElementaryPredication object (named_rel (x33)) at 140241380611800> 0*
>>>>> <ElementaryPredication object (_and_c_rel (e34)) at 140241380611920> 1
>>>>> <ElementaryPredication object (_in_p_rel (e35)) at 140241380612040> 1
>>>>> <ElementaryPredication object (udef_q_rel (x37)) at 140241380612160> 0
>>>>> <ElementaryPredication object (_a+few_a_1_rel (e36)) at 140241380612280> 1
>>>>> <ElementaryPredication object (_word_n_of_rel (x37)) at 140241380612400> 1
>>>>> <ElementaryPredication object (pron_rel (x38)) at 140241380612520> 0
>>>>> <ElementaryPredication object (pronoun_q_rel (x38)) at 140241380612640> 0
>>>>> <ElementaryPredication object (_sketch_v_1_rel (e39)) at 140241380612760> 1
>>>>> <ElementaryPredication object (_out_p_rel (e40)) at 140241380612880> 1
>>>>> <ElementaryPredication object (free_relative_q_rel (x41)) at 140241380613000> 0
>>>>> <ElementaryPredication object (thing_rel (x41)) at 140241380613120> 0
>>>>> <ElementaryPredication object (_occur_v_to_rel (e42)) at 140241380613240> 1
>>>>>
>>>>>
>>>>> Yours,
>>>>> --
>>>>> Tuan Anh Le
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Michael Wayne Goodman
>>>> Ph.D. Candidate, UW Linguistics
>>>>
>>>
>>>
>>>
>>> --
>>> Yours,
>>> --
>>> Tuan Anh Le
>>>
>>
>>
>>
>> --
>> Michael Wayne Goodman
>> Ph.D. Candidate, UW Linguistics
>>
>
>
>
> --
> Yours,
> --
> Tuan Anh Le
>



-- 
Michael Wayne Goodman
Ph.D. Candidate, UW Linguistics