[developers] Regular Expressions in Transfer Rules

Woodley Packard sweaglesw at sweaglesw.org
Wed Mar 1 08:20:39 CET 2017


Hi Francis,

I was able to translate "猫 は 吠える ." into "The cat barks." (and other variants) successfully, and with debugging output enabled (debug_transfer = 1 in transfer.c), noted the rule in question successfully firing for a count_noun_mark.

I also tried adding a new test_mark to the MRS to be transferred.  This got rewritten as "ja:test_mark" and did *not* end up matching the rule.  My hypothesis is that this is due to the final quote: if the predicate is compared with its closing quote still attached, the string ends in a quote character rather than in "mark", so the "$" anchor cannot match.  ACE’s transfer implementation does not (yet) fully implement the brave new world under which quotes and _rel suffixes are irrelevant to predicate matching, and when it does you will probably need to enable it with a configuration option.  A possible workaround would be to write "~mark\"?$", which allows an optional quote before the end of the string.

Incidentally, how do you feel about "~mark$" matching "_mark_rel"?  To me, it would seem hard to justify ignoring quotes without also ignoring the _rel suffix.  I would be curious to know the current LOGON implementation’s stance on this.
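To make the hypothesis about the final quote concrete, here is a small sketch using Python's `re` module as a stand-in.  It is not ACE's matching code: the helper `pred_matches` is hypothetical, the assumption that "~" simply introduces an unanchored regular expression is mine, and so is the assumption that the quoted predicate is compared with both quotes attached.  Note that the TDL form "~mark\"?$" corresponds to the regular expression `mark"?$` once the TDL string escape is removed.

```python
import re


def pred_matches(rule_pred: str, pred: str) -> bool:
    """Hypothetical helper, not ACE's implementation.

    A leading '~' means the rest of the rule predicate is a regular
    expression searched for anywhere in the predicate string; without
    it, the two strings must be identical.
    """
    if rule_pred.startswith("~"):
        return re.search(rule_pred[1:], pred) is not None
    return rule_pred == pred


# Assumption: a quoted predicate keeps its quotes when it is compared.
quoted = '"ja:test_mark"'
bare = "count_noun_mark"

# The original rule matches the bare predicate but not the quoted one,
# because the quoted string ends in '"' rather than in "mark".
print(pred_matches("~mark$", bare))
print(pred_matches("~mark$", quoted))

# The workaround allows an optional quote before the end of the string.
print(pred_matches('~mark"?$', bare))
print(pred_matches('~mark"?$', quoted))

# The unanchored rewrite over-matches; the workaround does not.
print(pred_matches("~_mark", "_mark_n_1"))
print(pred_matches('~mark"?$', "_mark_n_1"))

# With plain regex semantics, "~mark$" does not match "_mark_rel".
print(pred_matches("~mark$", "_mark_rel"))
```

Under these assumptions the workaround accepts both the bare and the quoted predicate while still rejecting "_mark_n_1".  Whether a predicate with a _rel suffix should also match is a question of normalization policy that a regular expression alone does not settle.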

If you don’t think the behavior you are observing matches the situation above, could you provide an example that causes the behavior so that I can reproduce it locally?

Regards,
Woodley

> On Feb 28, 2017, at 4:38 PM, Francis Bond <bond at ieee.org> wrote:
> 
> G'day,
> 
> Mike and I noticed (after a hint from Matic) that one of the rules in jaen did not fire in ACE in the same way it did with LOGON.
> 
> in jaen/erg.tdl there is a rule to get rid of any left over marks:
> 
> all_mark_ditch_ef := elision_mtr &
> [ INPUT.RELS < [ PRED "~mark$" ] > ].
> 
> This should match any predicate ending with "mark" --- but does not.   When we rewrote it to: 
> 
> all_mark_ditch_ef := elision_mtr &
> [ INPUT.RELS < [ PRED "~_mark" ] > ].
> 
> it matched most of the things we want, but it will also overmatch (including, e.g., "_mark_n_1").
> 
> Is this a principled difference in the regular expression handling?  Should we be writing it in another way?
> 
> Yours, 
> 
> -- 
> Francis Bond <http://www3.ntu.edu.sg/home/fcbond/>
> Division of Linguistics and Multilingual Studies
> Nanyang Technological University
