[developers] [spr31047] CLIM bug on 64-bit machines.

Stephan Oepen oe at csli.Stanford.EDU
Mon Jan 23 17:25:31 CET 2006


hi ben et al.

yes, at UiO we have the same problem with 64-bit Linux and, i believe,
CLIM: i can compile the LKB code and bring up its CLIM top-level pane,
but as soon as i make it compile a grammar the entire Lisp just dies:

  ioctl(5, FIONREAD, [0])                 = 0
  ioctl(5, FIONREAD, [0])                 = 0
  select(6, [5], [], [], {0, 0})          = 0 (Timeout)
  select(1, [0], NULL, NULL, {0, 0})      = 0 (Timeout)
  setitimer(ITIMER_REAL, {it_interval={2, 0}, it_value={2592000, 0}}, NULL) = 0
  setitimer(ITIMER_REAL, {it_interval={2, 0}, it_value={2, 0}}, NULL) = 0
  ioctl(5, FIONREAD, [0])                 = 0
  write(5, "=\0\4\0W\0@\3\2\0G\0\v\0008\0F\0\5\0N\0@\3+\0@\3\0\0\0"..., 224) \
    = 224
  read(5, 0x7fbffef970, 32)               = -1 EAGAIN \
    (Resource temporarily unavailable)
  select(6, [5], NULL, NULL, NULL)        = 1 (in [5])
  read(5, "\16\0\17\22U\0@\3\0\0>\0\0\0\0\0\0\0\0\0\0\0\0\0\33\2P"..., 32) = 32
  --- SIGSEGV (Segmentation fault) @ 0 (0) ---
  rt_sigreturn(0x2a955fca00)              = 68858701378
  --- SIGSEGV (Segmentation fault) @ 0 (0) ---
  rt_sigreturn(0x2a955fca00)              = 68858701378

  [...]

  rt_sigreturn(0x2a955fca00)              = 68858701378
  --- SIGSEGV (Segmentation fault) @ 0 (0) ---
  --- SIGSEGV (Segmentation fault) @ 0 (0) ---

the last message is repeated 11206 times before the process really dies
--- possibly we have overrun our signal queue or the like.

i attach the complete strace(1) output below; the file handling looks a
bit mysterious to me, i must say.  i even struggled when trying to work
out what file descriptor 5 is attached to at this point.  though maybe
it is actually irrelevant (see below).
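
on Linux, one way to see what a descriptor is attached to is to read
the symbolic link under /proc/<pid>/fd/ while the process is still
alive.  the following is only a minimal sketch of that idea; the helper
name describe_fd is made up for illustration and is not part of the LKB
or of ACL.

```python
import os


def describe_fd(pid, fd):
    """Return the target of /proc/<pid>/fd/<fd> (Linux only).

    Regular files show up as a path; sockets and pipes show up as
    'socket:[inode]' or 'pipe:[inode]'.  describe_fd is a hypothetical
    helper name used only for this illustration.
    """
    return os.readlink(f"/proc/{pid}/fd/{fd}")
```

calling describe_fd(pid, 5) with the process id of the running Lisp
would name the file, pipe or socket behind descriptor 5; it raises
OSError once the process is gone or when the descriptor is not open.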

the environment is RHEL4 on EM64T: Linux ar.uio.no 2.6.9-22.0.1.EL.  i
get this problem both with ACL 7.0 (plus patches) and 8.0 (plus patches
as of today).  the Motif i am using is:

  0 oe at ar (~/acl80.64) 44 $ ls -l /usr/X11R6/lib64/libXm.*
  -rw-r--r--  1 root root 4410594 Apr  4  2005 /usr/X11R6/lib64/libXm.a
  lrwxrwxrwx  1 root root      14 May 12  2005 /usr/X11R6/lib64/libXm.so \
    -> libXm.so.3.0.2
  lrwxrwxrwx  1 root root      14 May 12  2005 /usr/X11R6/lib64/libXm.so.3 \
    -> libXm.so.3.0.2
  -rwxr-xr-x  1 root root 2705376 Apr  4  2005 /usr/X11R6/lib64/libXm.so.3.0.2
  0 oe at ar (~/acl80.64) 45 $ rpm -q -f /usr/X11R6/lib64/libXm.so.3.0.2
  openmotif-2.2.3-9.RHEL4.1

i believe that CLIM is the culprit, as i can run the same code without
any graphical display, and things work fine.

i believe the problem may be related to gc() cursors.  in fact:

  CL-USER(7): (excl:gc)
  Error: Received signal number 11 (Segmentation violation)
    [condition type: SYNCHRONOUS-OPERATING-SYSTEM-SIGNAL]

  Restart actions (select using :continue):
   0: Return to Top Level (an "abort" restart).
   1: Abort entirely from this (lisp) process.
  [1] CL-USER(8): :cur
  (GC)
  [1] CL-USER(9): :bt
  Evaluation stack:

  GC <-
    [... EXCL::%EVAL ] <- EVAL <- TPL:TOP-LEVEL-READ-EVAL-PRINT-LOOP <-
    TPL:START-INTERACTIVE-TOP-LEVEL

hence it might be that a gc() triggered from the thread behind the CLIM
pane led to that long chain of SEGVs that eventually killed us?

at this point, i got curious:

  CL-USER(3): (excl:gc-before-c-hooks)
  (#(2510770148) #(182894372888))
  CL-USER(4): (setf (excl:gc-before-c-hooks) nil (excl:gc-after-c-hooks) nil)
  NIL

the latter appears to work around the problem:

  CL-USER(5): (excl:gc)
  scavenging...done eff: 97%, copy new: 9424 + old: 5621456 = 5630880
    Page faults: non-gc = 0 major + 582 minor, gc = 0 major + 1237 minor
  gc-after-hook(): local; new: 9424; old: 5621456; pending: 0; efficiency: 97.

finally, i attach the `/proc' maps for the process, and it would appear
that 2510770148 (== #x95a74fe4) is indeed an invalid pointer?
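
the check against the maps file can be sketched as follows: convert
each hook value to hex and test whether it falls inside any start-end
range of a /proc/<pid>/maps listing.  the function names and the two
sample map lines are hypothetical, made up for illustration; they are
not the contents of the attached maps file, so the result printed for
the sample says nothing about the real process.

```python
def parse_maps(text):
    """Parse /proc/<pid>/maps text into a list of (start, end) ranges."""
    regions = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The first field looks like '2a95556000-2a95700000' (hex).
        start, end = (int(part, 16) for part in fields[0].split("-"))
        regions.append((start, end))
    return regions


def is_mapped(address, regions):
    """True if address lies in some region; the end address is exclusive."""
    return any(start <= address < end for start, end in regions)


# The two hook values printed by (excl:gc-before-c-hooks) above.
HOOKS = (2510770148, 182894372888)

# Hypothetical sample, NOT the contents of the attached maps file.
SAMPLE_MAPS = """\
0000002a95556000-0000002a95700000 r-xp 00000000 08:01 12345 /hypothetical/lib.so
0000007fbffe0000-0000007fc0000000 rw-p 00000000 00:00 0
"""

regions = parse_maps(SAMPLE_MAPS)
for hook in HOOKS:
    print(f"{hook} = {hook:#x} mapped: {is_mapped(hook, regions)}")
```

to run it against a live process, replace SAMPLE_MAPS with the text
read from /proc/<pid>/maps of the running Lisp.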

--- LKB users, i have checked in an attempted work-around.  please give
it a shot.

                                                      all best  -  oe

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+++ Universitetet i Oslo (IFI); Boks 1080 Blindern; 0316 Oslo; (+47) 2285 7989
+++     CSLI Stanford; Ventura Hall; Stanford, CA 94305; (+1 650) 723 0515
+++       --- oe at csli.stanford.edu; oe at hf.uio.no; stephan at oepen.net ---
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

-------------- next part --------------
A non-text attachment was scrubbed...
Name: strace.gz
Type: application/x-gunzip
Size: 232654 bytes
Desc: not available
URL: <http://lists.delph-in.net/archives/developers/attachments/20060123/1f3e1a39/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: maps.gz
Type: application/x-gunzip
Size: 1601 bytes
Desc: not available
URL: <http://lists.delph-in.net/archives/developers/attachments/20060123/1f3e1a39/attachment-0001.bin>

