[Fwd: Re: [developers] Re: still problem with accents...]

Montserrat Marimon montserrat.marimon at upf.edu
Tue Jun 21 09:28:42 CEST 2005


Good morning Stephan,

>can you test two more things for me, please?  when you run the LKB now,
>please evaluate
>
>  excl:*locale*
>
>in the *common-lisp* buffer and email me the output.  ideally, i would
>like to see the entire *common-lisp* buffer, so as to know exactly what
>has happened since start-up.  as your grammar is coded ISO-8859-1 (aka
>Latin-1 or Western European), i believe the correct setting should be:
>
>  LKB(8): excl:*locale*
>  #<locale "es_ES" [:LATIN1-BASE] @ #x448563ea>
>  
>
This is what I get. (In fact I had a different locale set for Catalan and
Spanish; to test what you had done, I changed it everywhere.)

Starting image `/home/montse/delphin/lkb/linux.x86.32/lkb'
  with image (dxl) file `/home/montse/delphin/lkb/linux.x86.32/lkb.dxl'
  with arguments `(-locale es_ES.iso88591)'
  in directory `/home/montse/delphin'
  on machine `localhost'.

International Allegro CL Enterprise Edition
7.0 [Linux (x86)] (Jun 5, 2005 6:29)
Copyright (C) 1985-2004, Franz Inc., Oakland, CA, USA.  All Rights Reserved.

This standard runtime copy of Allegro CL was built by:
   [TC7227] Stanford University

; Foreign loading libpq.so.3.

[changing package from "COMMON-LISP-USER" to "LKB"]
LKB(1): ;; Setting (stream-external-format *terminal-io*) to :emacs-mule.
LKB(2): excl:*locale*
#<locale "es_ES" [:LATIN1-BASE] @ #x474cdfba>
LKB(3):


>secondly, i can load and run your grammar, once i make sure my encoding
>is consistently ISO-8859-1.  i attach an extended `dot.emacs', and you
>could please try the following:
>
>  - save the attachment somewhere, say `/tmp';
>  - M-x load-file RET /tmp/dot.emacs RET
>  - M-x spanish RET
>
>and then load your grammar.  this works for me, insofar as i can now
>
>  (do-parse-tty "el niño lloró")
>
>this yields no unknown word complaints, but unfortunately no parses.
>inspecting `Parse | Show parse chart', i see all three words in the
>chart, accented characters display properly, and there are lexical
>entries associated to them as expected (two for `el', one for `niño'
>and `lloró', respectively).  however, i fail to build the NP, as it
>seems `niño' is not going through `masc-sing-nom_infl_rule'.  after some
>poking around, i worked out that the line
>
>
>  inf-verb_infl_rule := inf-infl-rule.
>
>was confusing the reader for %suffix annotations, and once i change it
>to 
>
>  inf-verb_infl_rule := 
>  inf-infl-rule.
>  
>
what!?! Furthermore, this rule is not used...

>i can now parse fine.  
>
Strange... I can't even parse "lloró"...

(do-parse-tty "lloró")

Word `LLOR' is not in lexicon.
No parses found
0
0
0
0
[1] LKB(6):
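
One possible reading of the `LLOR' message above, offered as an assumption
rather than a diagnosis: if the Latin-1 byte for `ó' reaches a reader that
expects a different encoding, the character can be silently dropped before
lexical lookup. The Python sketch below is purely illustrative and does not
reflect how the LKB or Allegro CL actually read input; the helper name
decode_lossy is hypothetical.

```python
# Illustrative only: NOT the LKB's input code. It shows how an encoding
# mismatch can drop an accented character from a word.
WORD = "lloró"

# The same word as raw bytes in the two encodings discussed in this thread.
latin1_bytes = WORD.encode("iso-8859-1")  # b'llor\xf3'
utf8_bytes = WORD.encode("utf-8")         # b'llor\xc3\xb3'


def decode_lossy(raw: bytes, encoding: str) -> str:
    """Decode raw bytes, silently dropping anything invalid in `encoding`."""
    return raw.decode(encoding, errors="ignore")


# Latin-1 bytes read as UTF-8: the lone byte 0xF3 is not a complete UTF-8
# sequence, so a lossy decoder discards it.
mismatched = decode_lossy(latin1_bytes, "utf-8")

# Latin-1 bytes read as Latin-1: the accent survives.
matched = decode_lossy(latin1_bytes, "iso-8859-1")

if __name__ == "__main__":
    print(mismatched.upper())  # LLOR
    print(matched.upper())     # LLORÓ
```

If something along these lines is happening, the byte for `ó' is lost
somewhere between the Emacs buffer and the Lisp reader, which would be
consistent with a coding-system mismatch on the Emacs side rather than with
the locale setting shown earlier.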

>i think i will include the `dot.emacs' changes
>in the next LKB build, so feel free to just drop my version on top of
>the one the installer put into $DELPHINHOME/lkb/etc/.  also, note that
>you will have to use do-parse-tty() (or other LKB :tty mode functions)
>whenever you need to input accented characters; we will have to report
>the problems with the `Parse | Parse input' dialogue to the vendor of
>the graphics toolkit we use for the LKB.
>
>a few more remarks.  i noticed that loading your grammar is slow with
>the ready-to-run LKB binaries that we distribute.  you can make things
>faster by requesting more memory at start-up, so that the LKB need not
>grow to a suitable size while loading the grammar.  put the following
>
>  (system:resize-areas :old (* 32 1024 1024) :new (* 128 1024 1024))
>
>into a file `~/.lkbrc' in your home directory to request a bigger LKB
>memory footprint at start-up.
>
>also, to avoid potential for confusion regarding encodings, consider
>adding lines like the following
>
>  ;;; Hey, emacs(1), this is -*- mode: tdl; encoding: iso-8859-1 -*-
>
>to all files of your grammar.
>
>finally, a heads-up: ann has lately rewritten the morphology code from
>scratch, and that silly problem related to formatting i noticed above
>has been eliminated.  however, i also noted some issues when processing
>your grammar using the newer code.  i would like to forward the grammar
>and some notes on my findings to ann and the `developers' list.  do you
>agree to my doing that?
>
>                                                  all the best  -  oe
>
>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>+++ Universitetet i Oslo (ILN); Boks 1102 Blindern; 0317 Oslo; (+47) 2285 7989
>+++     CSLI Stanford; Ventura Hall; Stanford, CA 94305; (+1 650) 723 0515
>+++       --- oe at csli.stanford.edu; oe at hf.uio.no; stephan at oepen.net ---
>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
>  
>



