Bugs item #11134, was opened at 2007-05-25 12:14
You can respond by visiting: 
http://rubyforge.org/tracker/?func=detail&atid=1698&aid=11134&group_id=426

Category: Triage (do not select)
Group: 1.8.6
>Status: Closed
>Resolution: Rejected
Priority: 3
Submitted By: Gonzalo Garramuno (gga)
Assigned to: Yukihiro Matsumoto (matz)
Summary: Init_stack and ruby_init_stack fail to reinit stack (threads problem?)

Initial Comment:
On an AMD64 box at least (albeit probably others), Init_stack and ruby_init_stack can fail to reinitialize the stack correctly.  This problem can happen when Init_stack is invoked from two different places, such as two different C threads (or two different DSOs).  At least, that's as far as I was able to reproduce the problem.
This problem is also present in the 1.9 branch.

The problem is that in those cases the STACK_UPPER() in those functions can fail incorrectly.  This can later lead to ruby incorrectly thinking it has run of stack space or to mark incorrectly some VALUEs.

There's the need for an init stack function that works reliably across threads, or, alternatively, for one that does not do the STACK_UPPER() check.


Here's a trace (with some printfs added to Init_stack and ruby_init_stack to show the problems).  This is Maya running its renderer which invokes ruby as a shading language.


ruby_init_stack: (nil) at START
ruby_init_stack: should reset to 0x7fffb12eb980
ruby_init_stack: 0x7fffb12eb980 at END
Init_stack: 0x7fffb12eb980 at START
Init_stack should reset to: 0x7fffb12eb964
Init_stack: 0x7fffb12eb980 at END   -- ** NOT RESET, but correct **
mark stack 0x7fffb12eb980 - 0x7fffb12e7820- MARKED

Starting Rendering /home/gga/maya/projects/default/images/untitled.iff.
Rendering current frame.  ** this inits a new thread **

ruby_init_stack: 0x7fffb12eb980 at START
ruby_init_stack: should reset to 0x418021d0
ruby_init_stack: 0x7fffb12eb980 at END   -- **NOT RESET, NOT CORRECT **



----------------------------------------------------------------------

Comment By: Gonzalo Garramuno (gga)
Date: 2007-05-25 22:29

Message:
I can probably give you 2 or 3 different types of patches to
address the problem, albeit I'm not sure which one you'd prefer.

For ruby1.8:

Personally, I'd keep Init_stack() and ruby_init_stack(), but
expose only ruby_init_stack() in ruby.h (as is now).
I'd add a new int (boolean) parameter to ruby_init_stack()
to allow avoiding the STACK_CHECK().  This would break
backwards compatability of this function, but as it has only
recently been introduced that should not be too terrible.
I'd merge Init_stack() to call ruby_init_stack() to avoid
code duplication.
Also, I'd rename ruby_init_stack() to rb_init_stack() to be
consistant with the rest of the api.
Finally, I'd probably remove the 1Mb clamping of the stack
space that is used for win32/linux (not sure what its
purpose is, either -- seems like if you are reading
getrlimit values you should use them as is).
---
A much simpler patch is to expose both Init_stack() and
ruby_init_stack() in ruby.h.  Init_stack() would have the
STACK_CHECK() while ruby_init_stack() would not.  The
laziest way of doing this patch involves a change to 2 or 3
lines.
A better patch would involve merging Init_stack and
ruby_init_stack by having one call the other.
---

For ruby1.9:

I'm not too familiar with the insides of ruby1.9 yet, so my
question first is how does each thread keep knowledge of its
own stack?  
Currently the gc.c in ruby1.9 seems almost identical to 1.8
in this aspect, so I'm a tad puzzled how this is working at
all now.  It seems like YARV as it is right now is not safe
at all for embedding.


----------------------------------------------------------------------

Comment By: Yukihiro Matsumoto (matz)
Date: 2007-05-25 19:12

Message:
Having both Init_stack() and ruby_init_stack() is mostly due to the historical reason.  It's the dark-side of bazaar development. ;-)

I can do little without the code.  Embedding situation differs for each application.  If you have a patch to suggest "the fix", I'd love to see.



----------------------------------------------------------------------

Comment By: Gonzalo Garramuno (gga)
Date: 2007-05-25 18:55

Message:
Thanks, Matz, but I know that.  I'm reopening the report again.

Ruby IS running as a single thread.  The main application is
not.  However, this should NOT prevent you from using ruby
by carefully locking it properly.

The main application calls Init_stack(), but not from main,
as it is an embedded ruby.

For this reason, with embedded applications,
reinitialization of the stack is always needed. 
Traditionally, this was done with Init_stack().

In this case the additional headache is that the application
may spawn a new thread between each run of ruby code (still,
as far as ruby is concerned, only one simultaneous thread
runs).  This makes Init_stack() or ruby_init_stack() not
work properly.

ruby prevents my tool from reseting the stack, due to the
STACK_CHECK() macro.
If the STACK_CHECK() macro is commented out (and the stack
is reset on each iteration) all works well, as my callback
can store/restore the previous stack value for each "thread"
and use a lock to make the application truly single threaded.

However, without being able to reset the start of the stack,
all is hopeless.

A function is needed to force reseting of the stack
regardless of how/where it was inited properly.   Also,
what's the point of having Init_stack() and
ruby_init_stack()?  They seem to be almost identical.

P.S. Problem is also present in ruby1.9 HEAD, btw.


----------------------------------------------------------------------

Comment By: Yukihiro Matsumoto (matz)
Date: 2007-05-25 18:18

Message:
In Ruby 1.8, ruby_init_stack() does not suppose to be called twice.  Re-initialization is not available.  Nor it does not support work with native threading (unless Ruby interpreter runs on single thread).

Wait for 1.9 if you want to work with native thread.

----------------------------------------------------------------------

You can respond by visiting: 
http://rubyforge.org/tracker/?func=detail&atid=1698&aid=11134&group_id=426