I can confirm that removing mbari2 fixes the issue. I was able to get
a better stack trace, but I am still unsure of the root cause and
cannot reproduce the crash consistently. It looks like a double free
(glibc reports an invalid pointer passed to free()), and that
eventually causes the segfault.

*** glibc detected *** free(): invalid pointer: 0x0000000002312734 ***
*** glibc detected *** free(): invalid pointer: 0x0000000002312734 ***

Core was generated by `ruby gems/local/gems/god-0.7.8/bin/god'.
Program terminated with signal 6, Aborted.
#0  0x00007fc3d0cbb07b in raise () from /lib/libc.so.6
(gdb) bt
#0  0x00007fc3d0cbb07b in raise () from /lib/libc.so.6
#1  0x00007fc3d0cbc84e in abort () from /lib/libc.so.6
#2  0x00007fc3d0cf15f9 in __fsetlocking () from /lib/libc.so.6
#3  0x00007fc3d0cf8163 in mallopt () from /lib/libc.so.6
#4  0x00007fc3d0cf81ee in free () from /lib/libc.so.6
#5  0x00007fc3d134b4b8 in time_free (tobj=0x2312734) at time.c:43
#6  0x00007fc3d12dfed9 in rb_gc_call_finalizer_at_exit () at gc.c:2324
#7  0x00007fc3d12b6fd9 in ruby_finalize_1 () at eval.c:1561
#8  0x00007fc3d12b7146 in ruby_cleanup (ex=0) at eval.c:1598
#9  0x00007fc3d12b733c in ruby_stop (ex=0) at eval.c:1653
#10 0x00007fc3d1317306 in rb_f_fork (obj=140478970802560) at process.c:1343
#11 0x00007fc3d12c425a in call_cfunc (func=0x7fc3d1317286 <rb_f_fork>,
recv=140478970802560, len=0, argc=0, argv=0x0) at eval.c:5759
#12 0x00007fc3d12c3535 in rb_call0 (klass=140479007795520,
recv=140478970802560, id=5321, oid=5321, argc=0, argv=0x0,
body=0x7fc3d159dae8, flags=2) at eval.c:5911
#13 0x00007fc3d12c4d84 in rb_call (klass=140479007795520,
recv=140478970802560, mid=5321, argc=0, argv=0x0, scope=1,
self=140478970802560) at eval.c:6158
#14 0x00007fc3d12bc82b in rb_eval (self=140478970802560,
n=0x7fc3cfde5f18) at eval.c:3508
#15 0x00007fc3d12bb0d2 in rb_eval (self=140478970802560,
n=0x7fc3cfde5f40) at eval.c:3223
#16 0x00007fc3d12bd827 in rb_eval (self=140478970802560,
n=0x7fc3cfde5d60) at eval.c:3678
#17 0x00007fc3d12bb8dc in rb_eval (self=140478970802560,
n=0x7fc3cfde57c0) at eval.c:3357
#18 0x00007fc3d12ba068 in rb_eval (self=140478970802560,
n=0x7fc3cfde6878) at eval.c:2962
#19 0x00007fc3d12c3dfc in rb_call0 (klass=140478982783720,
recv=140478970802560, id=38449, oid=38449, argc=0,
argv=0x7fffd95ab848, body=0x7fc3cfde6878, flags=0) at eval.c:6062
#20 0x00007fc3d12c4d84 in rb_call (klass=140478982783720,
recv=140478970802560, mid=38449, argc=1, argv=0x7fffd95ab840, scope=0,
self=140478970803320) at eval.c:6158
#21 0x00007fc3d12bc4f1 in rb_eval (self=140478970803320,
n=0x7fc3cfe124a0) at eval.c:3493
#22 0x00007fc3d12c3dfc in rb_call0 (klass=140478982957680,
recv=140478970803320, id=38449, oid=38449, argc=0,
argv=0x7fffd95ac3f0, body=0x7fc3cfe124a0, flags=0) at eval.c:6062
#23 0x00007fc3d12c4d84 in rb_call (klass=140478982957680,
recv=140478970803320, mid=38449, argc=2, argv=0x7fffd95ac3e0, scope=1,
self=140478970803320) at eval.c:6158
#24 0x00007fc3d12bc82b in rb_eval (self=140478970803320,
n=0x7fc3cfe135d0) at eval.c:3508
#25 0x00007fc3d12ba068 in rb_eval (self=140478970803320,
n=0x7fc3cfe12900) at eval.c:2962
#26 0x00007fc3d12c3dfc in rb_call0 (klass=140478982957680,
recv=140478970803320, id=24553, oid=24553, argc=0,
argv=0x7fffd95ad6f8, body=0x7fc3cfe12900, flags=0) at eval.c:6062
#27 0x00007fc3d12c4d84 in rb_call (klass=140478982957680,
recv=140478970803320, mid=24553, argc=1, argv=0x7fffd95ad6f0, scope=0,
self=140478970803320) at eval.c:6158
#28 0x00007fc3d12bc4f1 in rb_eval (self=140478970803320,
n=0x7fc3d0b50068) at eval.c:3493
#29 0x00007fc3d12ba068 in rb_eval (self=140478970803320,
n=0x7fc3d0b42648) at eval.c:2962
#30 0x00007fc3d12c3dfc in rb_call0 (klass=140478996330640,
recv=140478970803320, id=24537, oid=24537, argc=0,
argv=0x7fffd95aea48, body=0x7fc3d0b42648, flags=0) at eval.c:6062
#31 0x00007fc3d12c4d84 in rb_call (klass=140478996330640,
recv=140478970803320, mid=24537, argc=1, argv=0x7fffd95aea40, scope=0,
self=140478970803320) at eval.c:6158
#32 0x00007fc3d12bc4f1 in rb_eval (self=140478970803320,
n=0x7fc3d0af5500) at eval.c:3493
#33 0x00007fc3d12bb651 in rb_eval (self=140478970803320,
n=0x7fc3d0b0bbc0) at eval.c:3309
#34 0x00007fc3d12c3dfc in rb_call0 (klass=140478996330640,
recv=140478970803320, id=26833, oid=26833, argc=0,
argv=0x7fffd95afd78, body=0x7fc3d0b0bbc0, flags=0) at eval.c:6062
#35 0x00007fc3d12c4d84 in rb_call (klass=140478996330640,
recv=140478970803320, mid=26833, argc=1, argv=0x7fffd95afd70, scope=0,
self=140478970802960) at eval.c:6158
#36 0x00007fc3d12bc4f1 in rb_eval (self=140478970802960,
n=0x7fc3cfe1cd88) at eval.c:3493
#37 0x00007fc3d12ba068 in rb_eval (self=140478970802960,
n=0x7fc3cfe1ca18) at eval.c:2962
#38 0x00007fc3d12c3dfc in rb_call0 (klass=140478983025360,
recv=140478970802960, id=26777, oid=26777, argc=0, argv=0x0,
body=0x7fc3cfe1ca18, flags=0) at eval.c:6062
#39 0x00007fc3d12c4d84 in rb_call (klass=140478983025360,
recv=140478970802960, mid=26777, argc=0, argv=0x0, scope=0,
self=140478970802960) at eval.c:6158
#40 0x00007fc3d12bc4f1 in rb_eval (self=140478970802960,
n=0x7fc3cfe1dfd0) at eval.c:3493
#41 0x00007fc3d12bb651 in rb_eval (self=140478970802960,
n=0x7fc3cfe1d8c8) at eval.c:3309
#42 0x00007fc3d12c0e81 in rb_yield_0 (val=6, self=140478970802960,
klass=0, flags=0, avalue=0) at eval.c:5083
#43 0x00007fc3d12c1553 in loop_i () at eval.c:5216
#44 0x00007fc3d12c2316 in rb_rescue2 (b_proc=0x7fc3d12c152e <loop_i>,
data1=0, r_proc=0, data2=0) at eval.c:5480
#45 0x00007fc3d12c15ca in rb_f_loop () at eval.c:5241
#46 0x00007fc3d12c425a in call_cfunc (func=0x7fc3d12c1593 <rb_f_loop>,
recv=140478970802960, len=0, argc=0, argv=0x0) at eval.c:5759
#47 0x00007fc3d12c3535 in rb_call0 (klass=140479007795520,
recv=140478970802960, id=4121, oid=4121, argc=0, argv=0x0,
body=0x7fc3d15b6b88, flags=2) at eval.c:5911
#48 0x00007fc3d12c4d84 in rb_call (klass=140479007795520,
recv=140478970802960, mid=4121, argc=0, argv=0x0, scope=1,
self=140478970802960) at eval.c:6158
#49 0x00007fc3d12bc82b in rb_eval (self=140478970802960,
n=0x7fc3cfe1d850) at eval.c:3508
#50 0x00007fc3d12bb0d2 in rb_eval (self=140478970802960,
n=0x7fc3cfe1d828) at eval.c:3223
#51 0x00007fc3d12c0e81 in rb_yield_0 (val=140478970802760,
self=140478970802960, klass=0, flags=1, avalue=2) at eval.c:5083
#52 0x00007fc3d12d21d5 in rb_thread_yield (arg=140478970802760,
th=0x230b190) at eval.c:12426
#53 0x00007fc3d12d1e60 in rb_thread_start_0 (fn=0x7fc3d12d20f3
<rb_thread_yield>, arg=0x7fc3cf273248, th=0x230b190) at eval.c:12344
#54 0x00007fc3d12d2327 in rb_thread_initialize
(thread=140478970802800, args=140478970802760) at eval.c:12500
#55 0x00007fc3d12c4223 in call_cfunc (func=0x7fc3d12d2257
<rb_thread_initialize>, recv=140478970802800, len=-2, argc=0,
argv=0x0) at eval.c:5753
#56 0x00007fc3d12c3535 in rb_call0 (klass=140479007761480,
recv=140478970802800, id=2961, oid=2961, argc=0, argv=0x0, body=0x0,
flags=4) at eval.c:5911
#57 0x00007fc3d12c3535 in rb_call0 (klass=140479007761480,
recv=140478968811240, id=333, oid=333, argc=2, argv=0x7fffd95b43b0,
body=0x7fc3d15b1890, flags=0) at eval.c:5911
#58 0x00007fc3d12c4d84 in rb_call (klass=140479007761480,
recv=140478968811240, mid=333, argc=2, argv=0x7fffd95b43b0, scope=0,
self=140478969580760) at eval.c:6158
#59 0x00007fc3d12bb0d2 in rb_eval (self=140478969580760,
n=0x7fc3d0750d50) at eval.c:3223
#60 0x000000000256edd0 in ?? ()
#61 0x000000000256f068 in ?? ()
#62 0x00007fffd95b4bd0 in ?? ()
#63 0x00007fffd95bbd90 in ?? ()
#64 0x0000000000000007 in ?? ()
#65 0x00007fffd95b4df0 in ?? ()
#66 0x00007fc3d12d02e3 in rb_thread_schedule () at eval.c:11251
Previous frame inner to this frame (corrupt stack?)

(gdb) define rb_trace
>  set $frame = ruby_frame
>  while $frame
>    set $node = $frame->node
>    print $node->nd_file
>    print ((unsigned int)(($node->flags>>19)&35184372088831)) # nd_line macro
>    set $frame = $frame->prev
>  end
>end

(gdb) rb_trace
$16 = 0x253cc31 "./gems/local/gems/god-0.7.8/bin/../lib/god/process.rb"
$17 = 215
$18 = 0x250ff11 "./gems/local/gems/god-0.7.8/bin/../lib/god/watch.rb"
$19 = 154
$20 = 0x250ff11 "./gems/local/gems/god-0.7.8/bin/../lib/god/watch.rb"
$21 = 117
$22 = 0x2393c51 "./gems/local/gems/god-0.7.8/bin/../lib/god/task.rb"
$23 = 171
$24 = 0x2393c51 "./gems/local/gems/god-0.7.8/bin/../lib/god/task.rb"
$25 = 344
$26 = 0x2507e61 "./gems/local/gems/god-0.7.8/bin/../lib/god/driver.rb"
$27 = 68
$28 = 0x2507e61 "./gems/local/gems/god-0.7.8/bin/../lib/god/driver.rb"
$29 = 41
$30 = 0x2507e61 "./gems/local/gems/god-0.7.8/bin/../lib/god/driver.rb"
$31 = 36
$32 = 0x2507e61 "./gems/local/gems/god-0.7.8/bin/../lib/god/driver.rb"
$33 = 36
$34 = 0x2507e61 "./gems/local/gems/god-0.7.8/bin/../lib/god/driver.rb"
$35 = 35
$36 = 0x2507e61 "./gems/local/gems/god-0.7.8/bin/../lib/god/driver.rb"
$37 = 35
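
A note on the magic number in the rb_trace macro: 35184372088831 is
2**45 - 1, i.e. the line number is taken from the bits of the node's
flags word above bit 19. Here is a minimal Ruby sketch of the same
arithmetic, assuming 64-bit pointers as on this x86_64 box. The constant
and method names are illustrative only and are not taken from the
session above.

```ruby
# Illustrative reimplementation of the line-number extraction done by the
# gdb macro. Assumes a 64-bit flags word; the names here are hypothetical.
NODE_LSHIFT  = 19                                    # shift used in the macro
POINTER_BITS = 64                                    # assumption: x86_64
NODE_LMASK   = (1 << (POINTER_BITS - NODE_LSHIFT)) - 1

# Returns the source line number packed into a node's flags word.
def nd_line(flags)
  # The final mask mimics the (unsigned int) cast in the macro.
  (flags >> NODE_LSHIFT) & NODE_LMASK & 0xFFFFFFFF
end
```

On a 32-bit build the mask would be narrower, so the literal in the
macro should not be copied over unchanged.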

God uses a double-fork to spawn processes, and it looks like the
double free usually occurs when the first forked process (in
process.rb:215) dies. God also uses a C extension
(http://github.com/mojombo/god/blob/master/ext/god/netlink_handler.c)
which could be causing issues across the fork.
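
For readers unfamiliar with the pattern, here is a rough sketch of a
double fork in Ruby. This is not God's actual code; spawn_detached and
the pipe used to hand the grandchild's pid back are assumptions made for
illustration. The first child exists only to fork the real process and
then exit, which is the exit path where the finalizers in the trace
above run.

```ruby
# Hypothetical double-fork helper, for illustration only (not from God).
# Returns the pid of the detached grandchild that runs `command`.
def spawn_detached(*command)
  reader, writer = IO.pipe

  first = fork do
    reader.close
    Process.setsid               # detach from the controlling terminal
    grandchild = fork do
      writer.close
      exec(*command)             # replace the grandchild with the command
    end
    writer.puts(grandchild)      # report the grandchild's pid to the parent
    writer.close
    exit!(0)                     # leave immediately; no at_exit handlers run
  end

  writer.close
  Process.waitpid(first)         # reap the short-lived first child
  pid = reader.gets.to_i
  reader.close
  pid
end
```

Calling spawn_detached("sleep", "60") would return the pid of the
sleeping grandchild. Ending the first child with exit! rather than a
normal exit skips Ruby's cleanup in that process. That might avoid the
free() in time_free, but it would only hide the underlying corruption
rather than explain it.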

  Aman

On Tue, Mar 10, 2009 at 8:46 PM, Brent Roman <brent / mbari.org> wrote:
>
> Aman,
>
> When I merge the MBARI patches with 1.8 HEAD, I also plan to replace the
> stack optimization introduced in the MBARI2 patch with the (better) thread
> anchors already in HEAD (which, I think, were originally backported from
> 1.9).  This should happen in the next week or so.  In the meantime, you
> might want to try this patch against the current (MBARI 8B) patches on 1.8.6
> or 1.8.7:
>
> http://www.nabble.com/file/p22385077/rmMBARI2.patch
>
> It just disables the MBARI2 patch and leaves the rest intact.
> It would be very helpful to find out whether or not that alone eliminates
> God's segfaults.
>
> Will you give this a try?
> If it works, I'll do an 8C patch to replace the stack splicing of
> MBARI2 with stack anchors on 1.8.7-p72 and perhaps 1.8.6-p287 as well.
>
> - brent
>
>
> Aman Gupta-6 wrote:
>>
>> I am continuing to see random segfaults on x86_64, especially with god
>> (http://god.rubyforge.org/), which makes liberal use of threads and
>> forking.
>>
>> ...
>>
>> So far I've been unable to come up with a reproducible test case, but
>> I've managed to narrow the problem down to mbari2. Vanilla ruby 1.8.7
>> does not have this issue, whereas 1.8.7+mbari2 will segfault randomly
>> every few days.
>>
>> Perhaps it is worth backporting thread anchors from ruby 1.8 HEAD?
>>
>>   Aman
>>
>>
>
> --
> View this message in context: http://www.nabble.com/-ruby-core%3A19846---Bug--744--memory-leak-in-callcc--tp20447794p22448384.html
> Sent from the ruby-core mailing list archive at Nabble.com.
>
>
>