[itsdb] [lkb] the fine system and unicode
Ben Waldron
benjamin.waldron at cl.cam.ac.uk
Mon Feb 13 12:48:33 CET 2006
Stephan Oepen wrote:
>getting UniCode to work in [incr tsdb()] is not much of a problem. you
>should make sure that
>
> (a) your [incr tsdb()] data files (skeletons or ASCII import files)
> are all coded in UTF-8.
>
You can use the 'file' command under Linux to check the encoding of files:
    bmw20@bmw-1:~/erg> file irregs.tab
    irregs.tab: UTF-8 Unicode text
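Note that 'file' reports pure ASCII files as "ASCII text"; those are also valid UTF-8. To check many files at once, a rough sketch along the following lines may help. It assumes iconv is installed, and the function name check_utf8 is made up for illustration; it is not part of [incr tsdb()] or the LKB.

```shell
#!/bin/sh
# Sketch: flag files that do not decode cleanly as UTF-8.
# check_utf8 is a hypothetical helper name, not an existing tool.

check_utf8() {
    # iconv exits with a non-zero status when the input contains
    # byte sequences that are not valid UTF-8.
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Check every file named on the command line.
for f in "$@"; do
    if check_utf8 "$f"; then
        echo "ok: $f"
    else
        echo "NOT UTF-8: $f"
    fi
done
```

Saved as, say, check-utf8.sh (again a hypothetical name), this could be run as 'sh check-utf8.sh *.tab' in the grammar directory. A file that fails the check can then be converted with iconv once its actual source encoding is known.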
> (b) the Lisp universe running [incr tsdb()] uses a UTF-8 locale; try
> evaluating excl:*locale* to check, and then maybe use the -locale
> command line option to the underlying Lisp image (ACL appears to
> not choose its initial locale based on the LANG shell variable).
>
An alternative to explicitly setting -locale when starting the Lisp
image is to set the coding system as a property of the grammar files.
E.g. you can place the following in GRAMMAR/lkb/globals:
    (when (lkb-version-after-p "2006/02/08 15:00:00")
      (set-coding-system utf-8))

Or, if your LKB image predates that version (and you are running Allegro CL):

    (setf excl:*locale* (excl::find-locale ".utf8"))

- Ben