Issue #9473 has been updated by Eric Wong.


 rasch / raschnet.com wrote:
 > 4. upgrading gems with C extensions
 
 Can you reproduce this without C extensions?
 Which C extensions do you run?  One of them is likely corrupting
 memory, possibly an obscure one.
 
 One of them (Pool2/Implementation.cpp) appears to be part of passenger,
 so maybe try reproducing the error under unicorn instead?
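
 To help answer the question of which C extensions are in play, here is a minimal sketch (not part of the original thread) that lists installed gems declaring native extensions. It assumes RubyGems 1.8 or later, where `Gem::Specification` is enumerable; the helper name `gems_with_c_extensions` is hypothetical.

```ruby
require 'rubygems'

# Hypothetical helper: returns "name version" strings for every gem
# specification that declares at least one native extension.
# `specs` defaults to all installed gem specifications (assumes
# RubyGems >= 1.8, where Gem::Specification is enumerable).
def gems_with_c_extensions(specs = Gem::Specification)
  with_ext = specs.select { |spec| !spec.extensions.empty? }
  with_ext.map { |spec| "#{spec.name} #{spec.version}" }.sort
end
```

 Calling `puts gems_with_c_extensions` prints every installed gem with an extension. Passing `Gem.loaded_specs.values` from inside the running application should narrow the list to gems actually activated in that process.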

----------------------------------------
Bug #9473: Corruption and Segmentation faults all over
https://bugs.ruby-lang.org/issues/9473#change-44903

* Author: David Rasch
* Status: Open
* Priority: Normal
* Assignee: 
* Category: 
* Target version: 
* ruby -v: ruby 1.9.3p484 (2013-11-22 revision 43786) [x86_64-linux]
* Backport: 
----------------------------------------
We're in the process of moving from Rails 2.3 to 3.2 (both running on Ruby 1.9.3-p484).

During this migration we've hit a snag: errors crop up within 2-3 hours of taking production traffic (or replays of it with siege).  We cannot be certain that these errors would not occur with Rails 2.3, but they appear more quickly and pervasively on the 3.2 branch.

The corruption sometimes surfaces as errors like the following, in places where they are highly improbable if not impossible:
"string contains null byte"
ActiveModel::MissingAttributeError "missing attribute: ..."
"undefined method `table_name' for false:FalseClass"

For example, this error makes little sense at this location:
  string contains null byte
  activesupport (3.2.16) lib/active_support/core_ext/class/attribute.rb:97:in `block in class_attribute'


As a result we've tried:
1. Upgrading to Ruby 1.9.3 HEAD
2. Removing our garbage collection tweaks
3. Turning on/off different areas of our codebase
4. Upgrading gems with C extensions

We've run independent tests on most of these variables but haven't been able to isolate the cause.

We're assuming these spurious errors are also related to the segmentation faults we've been seeing.  I've attached some examples.
The segfaults have happened all over the place, including in GC, compile, and str_replace.

We've tried running under valgrind to identify a root cause; on several reproductions it reports the first error at st.c:330, in st_lookup.



---Files--------------------------------
valgrind.txt (28.9 KB)
segfault1.txt (983 KB)
segfault2.txt (1.02 MB)
segfault3.txt (1.05 MB)

