A little more on this...

On 30-Nov-06, at 10:36 AM, Bob Hutchison wrote:

> Hi,
>
> I'm getting a 'Segmentation fault' in ruby 1.8.5 running on debian  
> in a Xen VPS. The same code running on OS X and a different version  
> of linux has no problems.
>
> The process to get this is maybe a little strange.
>
> 1) read a large file into a string (1.3MB)
> 2) eval the string (the string is a single ruby proc definition  
> that when called will build an object structure in memory)
> 3) call the proc --> Segmentation fault *very* soon after
>
> The file was generated by the same program, but running on a
> different machine, in this case the other linux box I
> mentioned above.
>
> Knowing full well that there can be all kinds of differences
> between the linuxes, I'll claim that the only interesting
> difference that I can find is/was in the architectures reported by
> ruby --version: the machine that works reports i686-linux, the
> machine that doesn't reports i386-linux -- so I rebuilt a version
> that was also i686 and, of course, this made no difference. So all  
> that means is that I can't find the truly interesting difference.
>
> If I edit the file from where the string is read, and replace a  
> bunch of assignments of a particular type of object (the objects  
> are still created) (about 6000 of them) then the problem  
> disappears. There's nothing special about the objects I got rid of,  
> it was just easy to use regular expressions to identify them and  
> get rid of their assignment.
>
> If I try running ruby through gdb there is a SIGSEGV signal at  
> eval.c:2890 -- which is the unknown_node method but I can't get a  
> more complete stacktrace (until I figure out how to build ruby with  
> the debug information not stripped out). Manually poking around  
> though, method_call calls rb_call0 calls unknown_node so I'm  
> betting on this. And so? Well maybe the eval of the string produced  
> an invalid proc object? What's the cause of this? Too long a  
> string? too many objects in the eval? too big a proc object? But  
> why work on one linux box and fail on the other?

So I put some printf calls into eval.c, and it turns out that
rb_eval is called recursively 5301 times before segfaulting, while
trying to handle a NODE_DASGN_CURR node. No other node types are
evaluated once this recursion begins; every node is a NODE_DASGN_CURR.

Nothing in the script that I am evaluating is nested anywhere near
that deep. So it looks as though the proc object is corrupt??

So maybe this is reproducible? Well, it is. If I run this script:

module SomeModule
   def initialize
     @@proc = nil
   end

   def SomeModule.build
     if @@proc then
       result = @@proc.call
       @@proc = nil
       return result
     end
   end
end

N = 5000

the_string = ""

the_string << "module SomeModule\n"
the_string << "  @@proc = Proc.new {\n"
the_string << "    thing = []\n"

N.times do | i |
   the_string <<  "    v#{i} = [#{i}]\n"
end

N.times do | i |
   the_string <<  "    thing << v#{i}\n"
end

the_string << "    thing\n"
the_string << "  } #proc\n"
the_string << "end\n"

puts("the_string length: #{the_string.length}")
eval(the_string, nil, "ruby_definition", 0)
SomeModule.build


It will fail on the one linux box, run on the other, and run on OS X.  
With a little binary search, the smallest N that causes the segfault  
is 3024 (3023 works).
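
The binary search mentioned above could be automated along these lines. This is only a sketch: smallest_failing and crashes? are hypothetical helper names, and it assumes the script above has been saved as repro.rb and changed to read N from ARGV[0] instead of hard-coding it. It also assumes the failure is monotonic in N (every N above the threshold crashes), which I have not verified.

```ruby
# Hypothetical helper: binary search for the smallest n in (good, bad]
# for which the block returns true. Assumes the block is false at
# `good`, true at `bad`, and flips exactly once in between.
def smallest_failing(good, bad)
  while bad - good > 1
    mid = (good + bad) / 2
    if yield(mid)
      bad = mid      # mid fails, so the threshold is at or below mid
    else
      good = mid     # mid works, so the threshold is above mid
    end
  end
  bad
end

# Hypothetical driver: run the repro in a child process so that a
# segfault in the child does not take the search down with it.
# Assumes repro.rb reads N from ARGV[0]. Any non-zero exit (or a
# failure to start ruby at all) is treated as a crash.
def crashes?(n, script = "repro.rb")
  !system("ruby", script, n.to_s)
end

# Example use, given that N = 0 works and N = 5000 fails:
#   puts smallest_failing(0, 5000) { |n| crashes?(n) }
```

Running each trial in a separate process is the main point here; calling the generated proc in the same interpreter would end the search at the first segfault.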

Does this help?




>
> I'm wondering if anyone has seen anything like this before or maybe
> has any experience debugging this kind of thing? Any suggestions
> very much appreciated.
>
> Thanks,
> Bob
>
> ----
> Bob Hutchison                  -- blogs at <http://www.recursive.ca/hutch/>
> Recursive Design Inc.          -- <http://www.recursive.ca/>
> Raconteur                      -- <http://www.raconteur.info/>
> xampl for Ruby                 -- <http://rubyforge.org/projects/xampl/>
>
>
>
>

----
Bob Hutchison                  -- blogs at <http://www.recursive.ca/hutch/>
Recursive Design Inc.          -- <http://www.recursive.ca/>
Raconteur                      -- <http://www.raconteur.info/>
xampl for Ruby                 -- <http://rubyforge.org/projects/xampl/>