[developers] orthographemic processing in PET

Stephan Oepen oe at csli.Stanford.EDU
Thu Nov 3 15:25:12 CET 2005


hi bernd (and others),

in response to my post-lisboa summary, dan reminded me of another issue
in orthographemic processing, viz. the move to treating irregular forms
as mere string-level alternations, thus allowing recursive segmentation
in chains combining regular and irregular orthographemics, e.g.

  sleep (sleep (past_verb_infl_rule slept) (punct_period_rule slept.))
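
to illustrate the idea (this is not PET's actual code): a minimal python sketch of recursive segmentation, where irregular forms are just a string-to-string table consulted at every step alongside the regular suffix rules. the tables, the `analyses' function, and the depth limit are all hypothetical; only the rule names and forms are taken from the example above.

```python
# Hypothetical, heavily simplified data; a real grammar defines these in
# its rule and irregular-form files.
IRREGULARS = {"slept": [("past_verb_infl_rule", "sleep")]}
SUFFIX_RULES = [("punct_period_rule", ".", "")]  # (rule, surface suffix, stem suffix)
LEXICON = {"sleep"}


def analyses(form, depth=3):
    """Yield (stem, chain) pairs for a surface form.

    chain lists (rule, resulting form) pairs from the innermost rule
    to the outermost one.  depth bounds the length of the rule chain.
    """
    if form in LEXICON:
        yield form, []
    if depth == 0:
        return
    # Irregular forms are plain string alternations, tried like any rule.
    candidates = list(IRREGULARS.get(form, []))
    for rule, surface, stem_suffix in SUFFIX_RULES:
        if form.endswith(surface) and len(form) > len(surface):
            candidates.append((rule, form[: len(form) - len(surface)] + stem_suffix))
    # Recurse on each smaller form, so regular and irregular steps can mix.
    for rule, smaller in candidates:
        for stem, chain in analyses(smaller, depth - 1):
            yield stem, chain + [(rule, form)]


if __name__ == "__main__":
    for stem, chain in analyses("slept."):
        print(stem, chain)
```

because the irregular table is looked up inside the recursion rather than once on the full token, stripping the period first and then undoing `slept' yields the same chain as in the example.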

i just checked in change # 877 (against the `oe' branch of PET) to fix
the above problem and the one i had actually mentioned:

>   - syntax for orthographemic rules: the recent LKB changes resulted in
>     cleaning up the syntax for %letter-set, %suffix, et al. annotations
>     on lexical rules.  only #\!, #\?, #\*, and #\) need escaping in the
>     new universe, and #\\ is the escape character.  i believe only the
>     ERG makes use of funny characters in orthographemics currently, but
>     still i think we should update PET to reflect the above.  i all but
>     promised dan to do this, hence will hopefully look into it soon.
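
as a rough sketch of the escaping convention quoted above (in python, not the LKB or PET implementation): the function names are hypothetical, and treating a literal backslash as needing an escape itself is my assumption rather than something the quoted text states.

```python
ESCAPE = "\\"          # the escape character
SPECIAL = set("!?*)")  # the only characters that need escaping


def tokenize(spec):
    """Split a rule spec into (char, is_literal) pairs.

    An escaped character is always literal; an unescaped character is
    literal unless it is one of the four special characters.
    """
    out = []
    i = 0
    while i < len(spec):
        c = spec[i]
        if c == ESCAPE:
            if i + 1 >= len(spec):
                raise ValueError("dangling escape at end of spec")
            out.append((spec[i + 1], True))
            i += 2
        else:
            out.append((c, c not in SPECIAL))
            i += 1
    return out


def escape(text):
    """Escape text so that every character reads back as a literal.

    Assumption: a literal backslash is written as a doubled backslash.
    """
    return "".join(ESCAPE + c if c in SPECIAL or c == ESCAPE else c for c in text)
```

a round trip such as `tokenize(escape(s))' should then give back the characters of `s', all marked as literal.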

i believe both changes to the morphology files should be propagated to
the `main' branch (i suspect at least berthold wants this).  this has
been in use for one ERG release cycle now, so should work okay :-).  i
hope P4 will manage to migrate this change across branches smoothly: it
should be sufficient to do the `morph.cpp' part.

the other change (derivation printing) is something i did while francis
was here, and i remember him submitting a patch already.

                                                       all best  -  oe

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+++ Universitetet i Oslo (ILN); Boks 1102 Blindern; 0317 Oslo; (+47) 2285 7989
+++     CSLI Stanford; Ventura Hall; Stanford, CA 94305; (+1 650) 723 0515
+++       --- oe at csli.stanford.edu; oe at hf.uio.no; stephan at oepen.net ---
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 877.p4
Type: application/octet-stream
Size: 11980 bytes
Desc: not available
URL: <http://lists.delph-in.net/archives/developers/attachments/20051103/f2e60741/attachment.obj>

