hi,

i just ran into an interesting segfault this weekend.

i have a Ruby application that uses my D-BUS bindings. 

to receive messages, i register a Proc object with the extension API,
which then registers a wrapper callback in C which does:

rb_funcall(proc_obj_value, rb_intern("call"), n, ...);

this is all fine, and works well.

as it happens though, one of my callbacks now does something which
changes hardware status, causing HAL to emit a message which will be
delivered to this same callback.

now, i'm not entirely sure whether its being delivered from another OS
thread, i suspect it may be (its invocation is being dispatched by
D-BUS itself by way of GLib).

if it is being called from another OS thread, what would be the safest
way to ensure callback invocation is serialized? Thread.critical?

on a slower machine, the code as it stands causes an interpreter
coredump, with the following backtrace:

#0  0x4015acdf in raise () from /lib/tls/libc.so.6
#1  0x4015c4e5 in abort () from /lib/tls/libc.so.6
#2  0x40039afe in rb_bug (fmt=0x0) at error.c:214
#3  0x4009f262 in sigsegv (sig=11) at signal.c:446
#4  <signal handler called>
#5  0x400a18c3 in st_lookup (table=0x812b7d0, key=5209,
value=0xbfffd7c8) at st.c:259
#6  0x4003cd66 in search_method (klass=1087036240, id=5209,
origin=0xbfffd7f4) at eval.c:380
#7  0x4003cdc1 in rb_get_method_body (klassp=0xbfffd850,
idp=0xbfffd834, noexp=0xbfffd838) at eval.c:401
#8  0x40047d6c in rb_call (klass=1087036240, recv=1087038460,
mid=5209, argc=2, argv=0xbfffd868, scope=1) at eval.c:5713
#9  0x400481dc in vafuncall (recv=1087038460, mid=1769366852, n=2,
ar=0xbfffd8a4) at ruby.h:631
#10 0x4004825c in rb_funcall (recv=1769366852, mid=1769366852,
n=1769366852) at eval.c:5831
#11 0x40302fe0 in _real_call_filter_cb (args=3221215760) at
ruby-dbus-connection.c:598
#12 0x40046d19 in rb_protect (proc=0x40302f50 <_real_call_filter_cb>,
data=3221215760, state=0xbfffda0c) at eval.c:5163
#13 0x4030308b in _on_filter_message (conn=0x8202678, msg=0x8152e78,
user_data=0x80c8f38) at ruby-dbus-connection.c:619
#14 0x40322e0c in dbus_connection_dispatch (connection=0x8202678) at
dbus-connection.c:2550
#15 0x40379a78 in gsource_connection_dispatch (source=0x8152d18,
callback=0x69766544, user_data=0x69766544) at dbus-gmain.c:258
#16 0x403a3c30 in unblock_source () from /usr/lib/libglib-2.0.so.0
#17 0x403a4c98 in g_main_context_dispatch () from /usr/lib/libglib-2.0.so.0
#18 0x403a4fad in g_main_context_dispatch () from /usr/lib/libglib-2.0.so.0
#19 0x403a56cf in g_main_loop_run () from /usr/lib/libglib-2.0.so.0
#20 0x4084445f in gtk_main () from /usr/lib/libgtk-x11-2.0.so.0
#21 0x4070b8d0 in Init_gtk_curve () from
/usr/lib/ruby/site_ruby/1.8/i386-linux/gtk2.so
#22 0x40054cce in call_cfunc (func=0x4070b8a0 <Init_gtk_curve+1376>,
recv=1076360312, len=135444432, argc=5209, argv=0x400cf620) at
eval.c:5384
#23 0x40047535 in rb_call0 (klass=1086822540, recv=1076360312,
id=5073, oid=1769366852, argc=0, argv=0x0, body=0x40c796d4, nosuper=0)
at eval.c:5516
#24 0x40047e54 in rb_call (klass=1086822540, recv=1076360312,
mid=5073, argc=0, argv=0x0, scope=0) at eval.c:5737
#25 0x4004275b in rb_eval (self=1076489492, n=0x69766544) at eval.c:3240
#26 0x4004782c in rb_call0 (klass=1086750040, recv=1076489492,
id=5105, oid=1769366852, argc=0, argv=0xbfffe66c, body=0x402c11d4,
nosuper=0)
    at eval.c:5644
#27 0x40047e54 in rb_call (klass=1086750040, recv=1076489492,
mid=5105, argc=1, argv=0xbfffe668, scope=0) at eval.c:5737
#28 0x4004275b in rb_eval (self=1076476352, n=0x69766544) at eval.c:3240
#29 0x40041dde in rb_eval (self=1076476352, n=0x69766544) at eval.c:3030
#30 0x40041f4c in rb_eval (self=1076476352, n=0x69766544) at eval.c:3079
#31 0x4003e7dc in ruby_exec_internal () at eval.c:1467
#32 0x4003e834 in ruby_exec () at eval.c:1487
#33 0x4003e875 in ruby_run () at eval.c:1497
#34 0x08048713 in main (argc=1769366852, argv=0x69766544,
envp=0xbffffad0) at main.c:46
(gdb)

this occurs at the same place every time, and i've tried three
different Ruby versions (a stable snapshot before -preview3, -preview3
itself, and the latest stable snapshot (20041211).

this leads me to believe i'm doing something unexpected by the interpreter.

modifying the callback's Ruby code to contain:

def cb
Thread.critical = true
....
ensure
Thread.critical = false
end

seems to fix the problem as well.

i'd appreciate if someone had a good explanation for the best thing to
do in this case. i'd prefer to do it at the C level so that users of
the bindings don't have to worry about this.

thanks!
leon