[developers] Comparing a profile with a grammar output

Sat Aug 22 17:57:53 CEST 2020

Hi Stephan,

Following your directions, I asked pydelphin to export with line breaks (--indent), and I successfully ran mtool with every metric except 'mrp'; see below.

For the profiles, you can find them at https://github.com/arademaker/sick-fftb. Thank you so much for your help.

You raised an interesting question about the `item identifier`. Is it part of the EDS? We may need a specification of a file format containing a sequence of EDS serializations (or native EDS syntax, as you also wrote). I think the same issue arises with the ACE stdout protocols (https://pydelphin.readthedocs.io/en/latest/api/delphin.ace.html#ace-stdout-protocols): for instance, a "SENT: ..." line precedes all MRSs, but it is not part of the MRS. These issues are all related to the work on the RDF Schemas…
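
To make the file-format question concrete, here is a minimal sketch of a reader for a sequence of EDS serializations. It assumes a layout that is only suggested in this thread, not specified anywhere: each item is optionally preceded by a line `#<identifier>`, and items are separated by blank lines. The function name `split_eds_items` and the sample text are hypothetical; the sample is a schematic EDS-like string, not actual ACE or pydelphin output.

```python
from typing import Iterator, Optional, Tuple


def split_eds_items(text: str) -> Iterator[Tuple[Optional[str], str]]:
    """Yield (identifier, eds_text) pairs from a multi-item EDS file.

    Assumed layout (not a specification): an optional '#<identifier>'
    line, then the EDS itself; a blank line or the next '#' line ends
    the current item.
    """
    identifier: Optional[str] = None
    body = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            # An identifier line also closes any item still open.
            if body:
                yield identifier, "\n".join(body)
                body = []
            identifier = stripped[1:].strip()
        elif stripped:
            body.append(line)
        elif body:
            # A blank line after some content ends the item.
            yield identifier, "\n".join(body)
            identifier, body = None, []
    if body:
        yield identifier, "\n".join(body)


# Schematic sample: one item with an identifier, one without.
sample = "#4711\n{e2:\n e2:_rain_v_1<6:14>[]\n}\n\n{e2:\n e2:_rain_v_1<6:14>[]\n}\n"
items = list(split_eds_items(sample))
```

The identifier is returned separately from the EDS text, which reflects the view that it belongs to the container format rather than to the EDS itself; an item without a `#` line simply gets `None`.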

Best,
Alexandre

echo "It is rainning today." | ace -g ../wn/terg-mac.dat -T -n 1 | delphin convert --indent --from ace --to eds > 1.eds
echo "It is rainning today." | ace -g ../wn/terg-mac.dat -T -n 1 | delphin convert --indent --from ace --to eds > 2.eds

% ./main.py --read eds --score ucca --gold ../sick/1.eds ../sick/2.eds
{"n": 1,
 "labeled": {"primary": {"g": 6, "s": 6, "c": 6, "p": 1.0, "r": 1.0, "f": 1.0}, "remote": {"g": 0, "s": 0, "c": 0, "p": 0.0, "r": 0.0, "f": 0.0}},
 "unlabeled": {"primary": {"g": 5, "s": 5, "c": 5, "p": 1.0, "r": 1.0, "f": 1.0}, "remote": {"g": 0, "s": 0, "c": 0, "p": 0.0, "r": 0.0, "f": 0.0}},
 "time": 0.0001461505889892578,
 "cpu": 0.0004350000000000742}

% ./main.py --read eds --score smatch --gold ../sick/1.eds ../sick/2.eds
{"n": 1,
 "g": 42,
 "s": 42,
 "c": 42,
 "p": 1.0,
 "r": 1.0,
 "f": 1.0,
 "time": 0.0055389404296875,
 "cpu": 0.016556000000000015}

% ./main.py --read eds --score edm --gold ../sick/1.eds ../sick/2.eds
{"n": 1,
 "names": {"g": 7, "s": 7, "c": 7, "p": 1.0, "r": 1.0, "f": 1.0},
 "arguments": {"g": 6, "s": 6, "c": 6, "p": 1.0, "r": 1.0, "f": 1.0},
 "tops": {"g": 1, "s": 1, "c": 1, "p": 1.0, "r": 1.0, "f": 1.0},
 "properties": {"g": 21, "s": 21, "c": 21, "p": 1.0, "r": 1.0, "f": 1.0},
 "all": {"g": 35, "s": 35, "c": 35, "p": 1.0, "r": 1.0, "f": 1.0},
 "time": 8.106231689453125e-05,
 "cpu": 0.00024100000000004673}

% ./main.py --read eds --score sdp --gold ../sick/1.eds ../sick/2.eds
{"n": 1,
 "labeled": {"g": 7, "s": 7, "c": 7, "p": 1.0, "r": 1.0, "f": 1.0, "m": 1.0},
 "unlabeled": {"g": 6, "s": 6, "c": 6, "p": 1.0, "r": 1.0, "f": 1.0, "m": 1.0},
 "time": 7.104873657226562e-05,
 "cpu": 0.00021099999999996122}

% ./main.py --read eds --score mrp --gold ../sick/1.eds ../sick/2.eds
Traceback (most recent call last):
  File "./main.py", line 472, in <module>
    main();
  File "./main.py", line 385, in main
    result = score.mces.evaluate(gold, graphs,
  File "/Users/ar/hpsg/mtool/score/mces.py", line 493, in evaluate
    for id, g, s, tops, labels, properties, anchors, \
  File "/Users/ar/hpsg/mtool/score/mces.py", line 490, in <genexpr>
    results = (schedule(g, s, rrhc_limit, mces_limit, trace, errors)
  File "/Users/ar/hpsg/mtool/score/mces.py", line 441, in schedule
    raise e;
  File "/Users/ar/hpsg/mtool/score/mces.py", line 389, in schedule
    = g.score(s, mapping);
  File "/Users/ar/hpsg/mtool/graph.py", line 856, in score
    = tuples(self, identities1);
  File "/Users/ar/hpsg/mtool/graph.py", line 771, in tuples
    anchors.add((identity, anchor));
TypeError: unhashable type: 'list'
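
The last frame of the traceback adds a tuple `(identity, anchor)` to a set. Python raises this kind of TypeError when an element of such a tuple is a list, because set members must be hashable. Below is a minimal reproduction that is independent of mtool; the values are made up, and whether converting the anchor to a tuple is the right fix inside mtool is an assumption that has not been checked against its code.

```python
anchors = set()
identity = 0
anchor = [6, 14]  # hypothetical anchor value stored as a list

message = ""
try:
    # Hashing the tuple requires hashing the list inside it, which fails.
    anchors.add((identity, anchor))
except TypeError as error:
    message = str(error)  # mentions "unhashable type: 'list'"

# Converting the list to a tuple makes the pair hashable.
anchors.add((identity, tuple(anchor)))
```

This suggests that, for the EDS input used here, at least one anchor reaches `tuples()` as a list rather than as a hashable value.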

> On 22 Aug 2020, at 04:02, Stephan Oepen <oe at ifi.uio.no> wrote:
> 
> hi again, alexandre and mike:
> 
>> I added the #XXXX right before the EDS serialization. The only difference between these files and https://github.com/cfmrp/mtool/blob/master/data/score/eds/wsj.pet.eds is that these files are not formatted with one predicate per line; instead, the EDS is serialised in a single line without line breaks.
> 
> i am tempted to declare those line breaks a necessary part of the
> native EDS syntax (though i see that the current EdsTop wiki page does
> not explicitly state that).  mike, could you change EDS serialization
> in pyDelphin to reflect the multi-line format exemplified on that
> page?  also, when you have an item identifier available i would
> suggest you prefix the EDS with an additional line (assuming the
> identifier is 4711):
> 
> #4711
> 
> this latter addition should be considered optional, though, and i
> shall check that the mtool EDS reader does not require it (i suspect
> currently it does; mtool has hardly been used in conjunction with
> native EDS serialization, so this is a welcome push toward better
> cross-format and -platform interoperability).
> 
> regarding your lack of success when invoking the scorer in [incr
> tsdb()], alexandre: could you make available to me a copy of the two
> profiles involved?
> 
> best wishes, oe