[developers] SBCL port

Ben Waldron bmw20 at cl.cam.ac.uk
Sat Oct 28 18:45:35 CEST 2006


I can now report that the ERG is able to run happily under SBCL. The 
memory problem was caused by the CL-PPCRE regex library, which is used in 
the preprocessor and MT components. Interestingly, the LKB/ERG runs 
considerably faster and requires far less memory when running under SBCL 
than it does under Allegro. The standard SBCL distribution comes with 
512MB dynamic space for its heap, which suffices for the LKB/ERG when 
using the fix to the regex code; but it's likely a good idea to compile 
your own copy of SBCL with a larger dynamic space. Details below.

=

The CL-PPCRE regex library comes with a parameter 
ppcre::*regex-char-code-limit* which provides "[t]he upper exclusive 
bound on the char-codes of characters which can occur in character 
classes. Change this value BEFORE creating scanners if you don't need 
the full Unicode support of LW, ACL, or CLISP." The default is the value 
of CHAR-CODE-LIMIT, which depends on the Lisp implementation in use; 
for Allegro it is 65536 (16-bit Unicode), for SBCL it is 1114112 (full 
Unicode). When creating a regex scanner, character classes are mapped to 
arrays of this size. Clearly, arrays of size 1114112 quickly eat up 
memory! To enable us to use the regex library under SBCL I've checked in 
the following fix: set ppcre::*regex-char-code-limit* to 65536 under 
SBCL by default. In the preprocessor code I go further and set it to a 
minimum sensible value; but the MT code will use my new default (oe, you 
will probably want to fix this sometime).
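
As a rough illustration of the numbers above, here is a minimal sketch 
that compares the number of elements in one character-class array under 
the two limits, followed by the kind of setting described. The variable 
and function names (*full-unicode-limit*, *sixteen-bit-limit*, 
class-array-ratio) are hypothetical and not taken from the LKB or 
CL-PPCRE sources; the element size of the arrays is not stated here, so 
only element counts are compared.

```lisp
;; Limits quoted in the text (hypothetical variable names).
(defparameter *full-unicode-limit* 1114112) ; CHAR-CODE-LIMIT under SBCL
(defparameter *sixteen-bit-limit* 65536)    ; CHAR-CODE-LIMIT under Allegro

(defun class-array-ratio (full reduced)
  "How many times more elements a FULL-sized array has than a REDUCED one."
  (/ full reduced))

;; Each character-class array is 17 times larger under the SBCL default.
(class-array-ratio *full-unicode-limit* *sixteen-bit-limit*)

;; The setting itself; it must take effect BEFORE any scanners are created.
;; With CL-PPCRE loaded this amounts to
;;   (setf ppcre::*regex-char-code-limit* 65536)
;; The lookup is guarded so the snippet still loads without CL-PPCRE.
#+sbcl
(let* ((pkg (find-package :ppcre))
       (sym (and pkg (find-symbol "*REGEX-CHAR-CODE-LIMIT*" pkg))))
  (when sym
    (setf (symbol-value sym) *sixteen-bit-limit*)))
```

Scanners created before the assignment keep whatever limit was in force 
when they were built, which is why the ordering matters.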

We are currently using version 1.2.3 of CL-PPCRE. The most recent 
version is 1.2.18. This incorporates a number of bug fixes (none of 
which appear to directly affect our code). I would like to upgrade the 
LKB to the latest version. Does anyone have any reason to want to keep 
the current version? (tsdb tells me it is safe to upgrade.)

When running the LKB/ERG over the CSLI test suite I get the following 
figures: Allegro takes 647.0 CPU seconds (652.3 real time) and requires 
458MB memory; SBCL takes 488.2 seconds (490.4 real time) and requires 
196MB memory. That is, under SBCL CPU time is reduced by about 24.5% and 
memory usage by 57.2% (!).
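
The percentages can be rechecked from the raw figures. The sketch below 
is only arithmetic on the numbers quoted above; percent-reduction is a 
hypothetical helper name, and a difference in the last digit is down to 
rounding.

```lisp
(defun percent-reduction (before after)
  "Percentage by which AFTER is smaller than BEFORE."
  (* 100.0 (/ (- before after) before)))

;; Allegro vs. SBCL on the CSLI test suite, figures as quoted above.
(format t "CPU time:  ~,1F%~%" (percent-reduction 647.0 488.2))
(format t "Real time: ~,1F%~%" (percent-reduction 652.3 490.4))
(format t "Memory:    ~,1F%~%" (percent-reduction 458 196))
```

Note that these are reductions in time and memory relative to Allegro, 
not speed-up factors.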

To create a custom build of SBCL with more dynamic space, change the 
value of the def!constant dynamic-space-end in the SBCL source code and 
recompile. E.g., on x86, raising it from #x29000000 (512MB) to 
#x49000000 (1GB):

--- sbcl-0.9.17/src/compiler/x86/parms.lisp.OLD 2006-10-28 18:34:52.000000000 +0200
+++ sbcl-0.9.17/src/compiler/x86/parms.lisp     2006-10-28 18:35:05.000000000 +0200
@@ -175,7 +175,7 @@
   (def!constant static-space-end          #x07fff000)

   (def!constant dynamic-space-start       #x09000000)
-  (def!constant dynamic-space-end         #x29000000)
+  (def!constant dynamic-space-end         #x49000000)

   (def!constant linkage-table-space-start #x70000000)
   (def!constant linkage-table-space-end   #x7ffff000))
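
To see what the two values of dynamic-space-end mean in practice, the 
following sketch computes the size of the dynamic space from the start 
and end addresses in the diff. dynamic-space-mb is a hypothetical helper 
name; the calculation assumes the end address is exclusive.

```lisp
(defun dynamic-space-mb (start end)
  "Size in MB of the address range from START up to (not including) END."
  (/ (- end start) (* 1024 1024)))

;; 512MB, matching the stock dynamic space mentioned above.
(dynamic-space-mb #x09000000 #x29000000)

;; 1024MB, i.e. 1GB, for the enlarged build.
(dynamic-space-mb #x09000000 #x49000000)
```

The enlarged range still ends well below linkage-table-space-start 
(#x70000000), so the two regions do not overlap.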

- Ben

Ben Waldron wrote:
>  the ERG causes the Lisp process to die (below) due to a memory 
> problem. I think this can probably be fixed by creating a custom build 
> of the SBCL compiler (apparently one should: edit DYNAMIC-SPACE-END in 
> src/compiler/.../params.lsp) -- but I've not tried this.
> ===
>
> * (read-script-file-aux "/home/bmw20/erg/lkb/script")
>
> ; in: LAMBDA NIL
> ;     (SETF LKB::*TRANSLATE-GRID* '(:EN :EN))
> ; ==>
> ;   (SETQ LKB::*TRANSLATE-GRID* '(:EN :EN))
> ;
> ; caught WARNING:
> ;   undefined variable: *TRANSLATE-GRID*
>
> ;
> ; caught WARNING:
> ;   This variable is undefined:
> ;     *TRANSLATE-GRID*
> ;
> ; compilation unit finished
> ;   caught 2 WARNING conditions
> STYLE-WARNING: redefining ESTABLISH-LINEAR-PRECEDENCE in DEFUN
> STYLE-WARNING: redefining SPELLING-CHANGE-RULE-P in DEFUN
> STYLE-WARNING: redefining REDUNDANCY-RULE-P in DEFUN
> STYLE-WARNING: redefining HIDE-IN-TYPE-HIERARCHY-P in DEFUN
> STYLE-WARNING: redefining MAKE-UNKNOWN-WORD-SENSE-UNIFICATIONS in DEFUN
> STYLE-WARNING: redefining INSTANTIATE-GENERIC-LEXICAL-ENTRY in DEFUN
> STYLE-WARNING: redefining MAKE-ORTH-TDFS in DEFUN
> STYLE-WARNING: redefining SET-TEMPORARY-LEXICON-FILENAMES in DEFUN
> STYLE-WARNING: redefining DAG-INFLECTED-P in DEFUN
> STYLE-WARNING: redefining BOOL-VALUE-TRUE in DEFUN
> STYLE-WARNING: redefining BOOL-VALUE-FALSE in DEFUN
> STYLE-WARNING: redefining DETERMINE-ARGUMENT-OPTIONALITY in DEFUN
> STYLE-WARNING: redefining DETERMINE-DERIVED-FORMS in DEFUN
> STYLE-WARNING: redefining LUI-CHART-EDGE-NAME in DEFUN
> STYLE-WARNING: redefining CHAIN-DOWN-MARGS in DEFUN
> STYLE-WARNING: redefining IDIOM-REL-P in DEFUN
> set-coding-system(): ignoring request for UTF8 on this Lisp implementation.
> Reading in type file fundamentals.tdl
> Reading in type file lextypes.tdl
> Reading in type file syntax.tdl
> Reading in type file lexrules.tdl
> Reading in type file auxverbs.tdl
> Reading in type file mtr.tdl
> Checking type hierarchy
> Checking for unique greatest lower bounds
> Expanding constraints
> Making constraints well formed
> Expanding defaults
> Type file checked successfully
> Computing display ordering
> Reading in cached leaf types
> Cached leaf types read
> Reading in cached lexicon (main)
> Cached lexicon read
> Reading in rules file constructions.tdl
> Reading in lexical rules file inflr.tdl
> Reading in lexical rules file inflr-pnct.tdl
> Reading in root file roots.tdl
> Reading in lexical rules file lexrinst.tdl
> Reading in parse node file parse-nodes.tdl
> Evaluation took:
>  13.495 seconds of real time
>  12.141154 seconds of user run time
>  0.781881 seconds of system run time
>  0 page faults and
>  272,134,856 bytes consed.
> read-vpm(): reading variable property mapping `semi.vpm'.
> read-transfer-rules(): reading file `idioms.mtr'.
> read-transfer-rules(): reading file `generation.mtr'.
> read-transfer-rules(): reading file `trigger.mtr'.
> Argh! gc_find_freeish_pages failed (restart_page), nbytes=4456456.
>   Gen Boxed Unboxed LB   LUB  !move  Alloc  Waste   Trig    WP  GCs Mem-age
>   0:    20     0  2178     0     0  8986536 16472  2000000    0   0  0.0000
>   1:    28     2     0 13068    17 53589904 59504  2000000    8   0  0.4999
>   2: 15185   415    54 71893    21 358134904 457608 360134904 15152   1  0.0000
>   3:  1315   196    44     0    74  6299280 70000  2000000  751   0  0.0000
>   4:  6343   767  1839   354   132 37900176 204912  2000000 7388   0  0.0000
>   5:     0     0     0     0     0        0     0  2000000    0   0  0.0000
>   6:  6071     0     0     0     0 24866816     0  2000000 5884   0  0.0000
>   Total bytes allocated=489777616
> fatal error encountered in SBCL pid 32736(tid 3085371072):
>
>
> The system is too badly corrupted or confused to continue at the Lisp
> level. If the system had been compiled with the SB-LDB feature, we'd drop
> into the LDB low-level debugger now. But there's no LDB in this build, so
> we can't really do anything but just exit, sorry.



