[developers] Re: grammar locale issues
Stephan Oepen
oe at csli.Stanford.EDU
Tue Jun 21 21:03:30 CEST 2005
hi again, ben,
> Do you have any comments on the proposals below?
yes, plenty :-). but not really much time to follow up on these right
now.
i am sympathetic to your proposal to build in more checks and balances,
specifically to make the `default' encoding for each grammar explicit as
a global variable (we have long had that property in PET). i thought
the *grammar-encoding* global, plus read-script-file-aux() doing some
sanity checking, was a promising idea.
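a minimal sketch of what such a sanity check could look like, written in python purely for illustration (this is not LKB code). GRAMMAR_ENCODING and check_grammar_file() are hypothetical names standing in for the per-grammar global and for the check that read-script-file-aux() might perform:

```python
from pathlib import Path

# Hypothetical stand-in for a per-grammar default encoding global.
GRAMMAR_ENCODING = "utf-8"


def check_grammar_file(path, encoding=GRAMMAR_ENCODING):
    """Return None if the file decodes cleanly in `encoding`,
    otherwise a short message naming the first offending byte offset."""
    data = Path(path).read_bytes()
    try:
        data.decode(encoding)
    except UnicodeDecodeError as err:
        return f"{path}: not valid {encoding} at byte {err.start}"
    return None
```

a clean decode is a necessary condition only: some encodings accept almost any byte sequence, so a check like this catches obvious mismatches (say, an EUC file loaded under a UTF-8 default) but cannot prove the declared encoding is the right one.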
also, i had been planning to toggle
  (defparameter cdb::*cdb-ascii-p* nil)
since the current LKB `dot.emacs' defaults to UTF-8 now, and i agree with
your expectation that ASCII grammars will not break when we write two-byte
CDB entries (i was hoping to test that assumption, though :-).
finally, i am nervous about the set-up you propose where we try to make
ELI communication always be UTF-8, but potentially have another coding
convention for the grammar files (or i/o with sub-processes). this is
a new idea to me (i did not think it would be possible), but in general
my experience has been that ensuring _one_ consistent coding system at
all levels is the path to happiness: i believe your proposal could mean
that the *common-lisp* buffer has a different (process) coding system
than buffers visiting TDL files (e.g. for JaCY, where files continue to
be in EUC, for now). i fear that will be harder to set up reliably in
emacs(1) than just one consistent scheme, and that it will create potential
for user confusion (and i have seen difficulties pasting in X across
encodings).
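to make the mismatch concrete, here is a small python sketch (again illustrative, not part of the LKB) showing that the same three-character string is a different byte sequence under EUC-JP and UTF-8, so bytes written under one convention do not decode under the other. try_decode() is a hypothetical helper:

```python
text = "食べた"

utf8_bytes = text.encode("utf-8")   # three bytes per character here
euc_bytes = text.encode("euc_jp")   # two bytes per character here


def try_decode(data, encoding):
    """Decode `data` in `encoding`; return None if the bytes are invalid."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


print(len(utf8_bytes), len(euc_bytes))
print(try_decode(euc_bytes, "euc_jp") == text)  # round trip in one encoding
print(try_decode(euc_bytes, "utf-8"))           # EUC bytes read as UTF-8
```

any layer that silently assumes the other convention (a process buffer, a sub-process pipe, a paste) hits exactly this kind of failure, which is why a single coding system end to end is easier to reason about.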
i am not convinced this level of sophistication is really needed. some
of the currently documented procedures are more complex than i think is
required (today). for example, the following just works for me (modulo
substitution of $DELPHINHOME, of course):
  emacs -q &
  M-x load-file RET $DELPHINHOME/lkb/etc/dot.emacs RET
  M-x japanese RET
  (read-script-file-aux "$DELPHINHOME/japanese/lkb/ascript")
  (do-parse-tty "食べた")
--- melanie will be visiting here in july, and francis and i expect to
streamline set-up for JaCY during her visit.
somewhat more high-level, i am inclined to encourage more people to use
UTF-8, but in western europe and japan, at least, there appears to be a
strong, established non-Unicode tradition :-{.
so much for tonight; hth - oe
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+++ Universitetet i Oslo (ILN); Boks 1102 Blindern; 0317 Oslo; (+47) 2285 7989
+++ CSLI Stanford; Ventura Hall; Stanford, CA 94305; (+1 650) 723 0515
+++ --- oe at csli.stanford.edu; oe at hf.uio.no; stephan at oepen.net ---
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++