[developers] cfrom/cto in cheap - bug when packing?

Francis Bond fcbond at gmail.com
Thu Nov 24 05:41:03 CET 2005


G'day,

We think there is a bug in how cheap calculates cfrom/cto when
packing is turned on.  We parsed the following sentence with a
recent JACY and cheap 0.77 with Eric's patches:
庭 で 食べる
niwa de taberu
garden in eat
"(I) eat in the garden"

The (token-based) cfrom value for で (de) is 1 without packing, which is
what we expect.  With -packing=7 it is 0, so the span for で becomes
0-2 instead of 1-2.  We get analogous results if we use -tok=xml_counts
and a pic.xml input with character positions specified.

I haven't been able to confirm this with any other grammar, as I still
can't get cfrom/cto to work with the ERG.

Bernd: would you be able to have a look at this?

Francis

P.S.  Here are more complete results.  I would be happy to send
higher-verbosity output if it would help:

$ cheap -mrs=xml -results=1 japanese
庭 で 食べる
(1) `庭 で 食べる' [0] --- 1 (0.01|0.01s) <23:56> (442.4K) [0.0s]
derivation[1] (-1.12):庭 で 食べる

<rmrs cfrom='-1' cto='-1'>
<label vid='1'/>
<ep cfrom='0' cto='3'><gpred>proposition_m_rel</gpred><label vid='1'/><var sort='e' vid='2'/></ep>
<ep cfrom='0' cto='1'><realpred lemma='niwa' pos='n' sense='1'/><label vid='4'/><var sort='x' vid='5'/></ep>
<ep cfrom='0' cto='1'><gpred>udef_rel</gpred><label vid='6'/><var sort='x' vid='5'/></ep>
<ep cfrom='1' cto='2'><realpred lemma='de' pos='p'/><label vid='9'/><var sort='e' vid='10'/></ep>
<ep cfrom='2' cto='3'><realpred lemma='taberu' pos='v'/><label vid='11'/><var sort='e' vid='2'/></ep>
<rarg><rargname>MARG</rargname><label vid='1'/><var sort='h' vid='3'/></rarg>
<rarg><rargname>RSTR</rargname><label vid='6'/><var sort='h' vid='7'/></rarg>
<rarg><rargname>BODY</rargname><label vid='6'/><var sort='h' vid='8'/></rarg>
<rarg><rargname>ARG1</rargname><label vid='9'/><var sort='x' vid='5'/></rarg>
<rarg><rargname>ARG2</rargname><label vid='9'/><var sort='e' vid='2'/></rarg>
<hcons hreln='qeq'><hi><var sort='h' vid='3'/></hi><lo><label vid='11'/></lo></hcons>
<hcons hreln='qeq'><hi><var sort='h' vid='7'/></hi><lo><label vid='4'/></lo></hcons>
</rmrs>



$ cheap -mrs=xml -results=1 -packing japanese
庭 で 食べる
(2) `庭 で 食べる' [0] --- 1 (0.01|0.01s) <23:56> (483.5K) [0.0s]
derivation[1] (-1.12):庭 で 食べる

<rmrs cfrom='-1' cto='-1'>
<label vid='1'/>
<ep cfrom='0' cto='3'><gpred>proposition_m_rel</gpred><label vid='1'/><var sort='e' vid='2'/></ep>
<ep cfrom='0' cto='1'><realpred lemma='niwa' pos='n' sense='1'/><label vid='4'/><var sort='x' vid='5'/></ep>
<ep cfrom='0' cto='1'><gpred>udef_rel</gpred><label vid='6'/><var sort='x' vid='5'/></ep>
<ep cfrom='0' cto='2'><realpred lemma='de' pos='p'/><label vid='9'/><var sort='e' vid='10'/></ep>
<ep cfrom='2' cto='3'><realpred lemma='taberu' pos='v'/><label vid='11'/><var sort='e' vid='2'/></ep>
<rarg><rargname>MARG</rargname><label vid='1'/><var sort='h' vid='3'/></rarg>
<rarg><rargname>RSTR</rargname><label vid='6'/><var sort='h' vid='7'/></rarg>
<rarg><rargname>BODY</rargname><label vid='6'/><var sort='h' vid='8'/></rarg>
<rarg><rargname>ARG1</rargname><label vid='9'/><var sort='x' vid='5'/></rarg>
<rarg><rargname>ARG2</rargname><label vid='9'/><var sort='e' vid='2'/></rarg>
<hcons hreln='qeq'><hi><var sort='h' vid='3'/></hi><lo><label vid='11'/></lo></hcons>
<hcons hreln='qeq'><hi><var sort='h' vid='7'/></hi><lo><label vid='4'/></lo></hcons>
</rmrs>
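
The only difference between the two outputs is the span on the で (de)
EP. For anyone who wants to check this mechanically on other sentences,
here is a minimal Python sketch that reads two RMRS XML strings and
reports every EP whose cfrom/cto differs. The helper names ep_spans and
diff_spans are hypothetical (they are not part of cheap), and the two
sample strings are abbreviated copies of the outputs above with only
three EPs kept. It assumes that predicate names are unique within one
RMRS, which holds here but not in general.

```python
import xml.etree.ElementTree as ET

# Abbreviated copies of the outputs above (illustration only).
UNPACKED = """<rmrs cfrom='-1' cto='-1'>
<ep cfrom='0' cto='1'><realpred lemma='niwa' pos='n' sense='1'/><label vid='4'/></ep>
<ep cfrom='1' cto='2'><realpred lemma='de' pos='p'/><label vid='9'/></ep>
<ep cfrom='2' cto='3'><realpred lemma='taberu' pos='v'/><label vid='11'/></ep>
</rmrs>"""

PACKED = """<rmrs cfrom='-1' cto='-1'>
<ep cfrom='0' cto='1'><realpred lemma='niwa' pos='n' sense='1'/><label vid='4'/></ep>
<ep cfrom='0' cto='2'><realpred lemma='de' pos='p'/><label vid='9'/></ep>
<ep cfrom='2' cto='3'><realpred lemma='taberu' pos='v'/><label vid='11'/></ep>
</rmrs>"""


def ep_spans(rmrs_xml):
    """Map each EP's predicate name to its (cfrom, cto) span."""
    root = ET.fromstring(rmrs_xml)
    spans = {}
    for ep in root.findall("ep"):
        real = ep.find("realpred")
        # Use the lemma for real predicates, the text for grammar predicates.
        name = real.get("lemma") if real is not None else ep.findtext("gpred")
        spans[name] = (int(ep.get("cfrom")), int(ep.get("cto")))
    return spans


def diff_spans(a, b):
    """Return {name: (span_in_a, span_in_b)} for EPs whose spans differ."""
    return {n: (a[n], b[n]) for n in a if n in b and a[n] != b[n]}


if __name__ == "__main__":
    changed = diff_spans(ep_spans(UNPACKED), ep_spans(PACKED))
    for name, (plain, packed) in changed.items():
        print(f"{name}: {plain} without packing, {packed} with packing")
```

On the abbreviated samples this reports only the de EP, with span
(1, 2) without packing and (0, 2) with packing.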

--
Francis Bond  <www.kecl.ntt.co.jp/icl/mtg/members/bond/>
NTT Communication Science Laboratories | Machine Translation Research Group


