[developers] RMRS characterization differences
Sergio Roa
sergior at coli.uni-sb.de
Wed Jul 11 21:27:45 CEST 2007
Hello,
I am using cheap to extract RMRSs with characterization information,
and I am getting different results from the svn revision of cheap and
from the LKB. The following example illustrates the issue. Parsing the
sentence "I show my work" with cheap, I get the following character
spans:
<rmrs cfrom='-1' cto='-1'>
<label vid='1'/>
<ep cfrom='0' cto='14'><gpred>prop-or-ques_m_rel</gpred><label vid='1'/><var sort='e' vid='2'/></ep>
<ep cfrom='0' cto='14'><gpred>pron_rel</gpred><label vid='6'/><var sort='x' vid='7'/></ep>
<ep cfrom='0' cto='14'><gpred>pronoun_q_rel</gpred><label vid='8'/><var sort='x' vid='7'/></ep>
<ep cfrom='2' cto='6'><realpred lemma='show' pos='v' sense='1'/><label vid='11'/><var sort='e' vid='2'/></ep>
<ep cfrom='7' cto='14'><gpred>def_explicit_q_rel</gpred><label vid='13'/><var sort='x' vid='12'/></ep>
<ep cfrom='7' cto='14'><gpred>poss_rel</gpred><label vid='16'/><var sort='e' vid='18'/></ep>
<ep cfrom='7' cto='14'><gpred>pronoun_q_rel</gpred><label vid='19'/><var sort='x' vid='17'/></ep>
<ep cfrom='7' cto='14'><gpred>pron_rel</gpred><label vid='22'/><var sort='x' vid='17'/></ep>
<ep cfrom='10' cto='14'><realpred lemma='work' pos='n' sense='1'/><label vid='10001'/><var sort='x' vid='12'/></ep>
[...]
So the pron_rel, which should correspond to "I", spans the range from
0 to 14, i.e. the whole sentence. In contrast, in the LKB I get the
following result:
<rmrs cfrom='-1' cto='-1'>
<label vid='1'/>
<ep cfrom='0' cto='14'><gpred>prop-or-ques_m_rel</gpred><label vid='1'/><var sort='e' vid='2'/></ep>
<ep cfrom='0' cto='1'><gpred>pron_rel</gpred><label vid='6'/><var sort='x' vid='7'/></ep>
<ep cfrom='0' cto='1'><gpred>pronoun_q_rel</gpred><label vid='8'/><var sort='x' vid='7'/></ep>
<ep cfrom='2' cto='6'><realpred lemma='show' pos='v' sense='1'/><label vid='11'/><var sort='e' vid='2'/></ep>
<ep cfrom='7' cto='9'><gpred>def_explicit_q_rel</gpred><label vid='13'/><var sort='x' vid='12'/></ep>
<ep cfrom='7' cto='9'><gpred>poss_rel</gpred><label vid='16'/><var sort='e' vid='18'/></ep>
<ep cfrom='7' cto='9'><gpred>pronoun_q_rel</gpred><label vid='19'/><var sort='x' vid='17'/></ep>
<ep cfrom='7' cto='9'><gpred>pron_rel</gpred><label vid='22'/><var sort='x' vid='17'/></ep>
<ep cfrom='10' cto='14'><realpred lemma='work' pos='n' sense='1'/><label vid='10001'/><var sort='x' vid='12'/></ep>
[...]
I think the latter is correct, because pron_rel corresponds to "I",
which covers characters 0 to 1. The character spans for the "poss_rel"
relation also differ (7 to 14 in cheap, 7 to 9 in the LKB). Might this
be a bug in cheap? There is no difference in the output for relations
like "named_rel", for example when parsing "John shows the work", but
the spans associated with "the" still differ.
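
In case it helps to reproduce the comparison, here is a minimal sketch
of how the two outputs can be diffed. It is only an illustration: the
two XML strings are shortened, hand-closed excerpts of the outputs
above (the real output is truncated at "[...]"), the helper names
ep_spans, span_differences and token_spans are my own and not part of
cheap or the LKB, and matching relations by predicate name plus label
vid is an assumption that happens to hold for this example.

```python
import re
import xml.etree.ElementTree as ET

# Shortened excerpts of the two outputs, closed by hand so they parse.
CHEAP = """<rmrs cfrom='-1' cto='-1'>
<ep cfrom='0' cto='14'><gpred>pron_rel</gpred><label vid='6'/><var sort='x' vid='7'/></ep>
<ep cfrom='2' cto='6'><realpred lemma='show' pos='v' sense='1'/><label vid='11'/><var sort='e' vid='2'/></ep>
<ep cfrom='7' cto='14'><gpred>poss_rel</gpred><label vid='16'/><var sort='e' vid='18'/></ep>
</rmrs>"""

LKB = """<rmrs cfrom='-1' cto='-1'>
<ep cfrom='0' cto='1'><gpred>pron_rel</gpred><label vid='6'/><var sort='x' vid='7'/></ep>
<ep cfrom='2' cto='6'><realpred lemma='show' pos='v' sense='1'/><label vid='11'/><var sort='e' vid='2'/></ep>
<ep cfrom='7' cto='9'><gpred>poss_rel</gpred><label vid='16'/><var sort='e' vid='18'/></ep>
</rmrs>"""


def ep_spans(xml_text):
    """Map (predicate name, label vid) to (cfrom, cto) for every <ep>."""
    spans = {}
    for ep in ET.fromstring(xml_text).iter("ep"):
        gpred = ep.find("gpred")
        real = ep.find("realpred")
        if gpred is not None:
            name = gpred.text
        elif real is not None:
            name = "{}/{}".format(real.get("lemma"), real.get("pos"))
        else:
            continue  # no recognisable predicate; skip this ep
        key = (name, ep.find("label").get("vid"))
        spans[key] = (int(ep.get("cfrom")), int(ep.get("cto")))
    return spans


def span_differences(xml_a, xml_b):
    """Return the relations present in both outputs whose spans differ."""
    a, b = ep_spans(xml_a), ep_spans(xml_b)
    return {k: (a[k], b[k]) for k in a.keys() & b.keys() if a[k] != b[k]}


def token_spans(sentence):
    """Character span of each whitespace-separated token."""
    return {m.group(): (m.start(), m.end())
            for m in re.finditer(r"\S+", sentence)}


diffs = span_differences(CHEAP, LKB)
tokens = token_spans("I show my work")
for (name, vid), (cheap_span, lkb_span) in sorted(diffs.items()):
    print(name, vid, "cheap:", cheap_span, "lkb:", lkb_span)
```

On these excerpts only pron_rel and poss_rel are reported, and the
spans on the LKB side coincide with the token spans of "I" and "my".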
Thanks,
Sergio.