Hey all,

Okay, I have some more information.

Re the use of alloca in foo.cpp, I remembered a GNU extension that I rarely
use (variable-length arrays), and it can be safely swapped in. In foo.cpp,
replace:

   unsigned char *data = (unsigned char *)alloca(ss);

with:

   unsigned char data[ss];

I generally prefer to keep my code portable, but this change helps rule out
alloca as the culprit. The sample code still crashes, just not at the same
point, so alloca is not to blame.
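
For anyone who would rather avoid both alloca and the GNU extension, a
heap-backed buffer is the portable alternative. The following is a minimal
sketch only: fill_scratch is a hypothetical stand-in for the code in foo.cpp
that needs an ss-byte buffer, and this variant is not among the
configurations tabulated below.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the part of foo.cpp that needs an ss-byte
// scratch buffer; the real function is not shown in this post.
std::size_t fill_scratch(std::size_t ss)
{
    // Heap-backed storage: no alloca and no GNU variable-length array.
    std::vector<unsigned char> storage(ss);
    assert(storage.size() == ss);

    // Raw pointer for code that expects "unsigned char *data".
    unsigned char *data = storage.empty() ? 0 : &storage[0];

    if (data)
        std::memset(data, 0xAB, ss);  // use it like the old buffer

    // Count the bytes written so the caller can check the buffer was usable.
    std::size_t written = 0;
    for (std::size_t i = 0; i < storage.size(); ++i)
        if (storage[i] == 0xAB)
            ++written;
    return written;
}
```

Calling fill_scratch(64) should report 64 bytes written, and the storage is
released automatically when the function returns. One caveat: Ruby's garbage
collector scans the machine stack but not memory obtained from the heap, so
this is only a drop-in replacement if the buffer never holds the sole
reference to a Ruby object.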

I've rebuilt the same foo.cpp code with multiple versions of Ruby and
tabulated the results:

(best viewed in fixed-width)

Version                           Mode 3    Mode 1
-------                           ------    ------
1.8.5-p12                            125
1.8.4                                 91
1.6.8                                  0         0
Stable repos snapshot                125
Nightly checkout snapshot              0         0

Snapshots were downloaded on 22/1/2007, SA Central time (SA).

In each case I made any minor modifications required to make it build (e.g.
fixing prototypes).

In all cases the sample code (foo.cpp) crashed.

Mode 3 is the iteration at which it crashed when run with "messy".
Mode 1 is the iteration at which it crashed when run on its own, i.e.
without "messy".
A result of 0 means it crashed before it started iterating. Blank means it
was not run; I only bothered in the cases where Mode 3 crashed quickly.

The gist is that the sample code provided will crash on every version I
tried (1.6.8, 1.8.4, 1.8.5-p12, latest stable and unstable CVS). It is fair
to assume it will fail on many more.

Approaching the problem from another angle, I figured I'd try different
tricks when building Ruby itself. For the following tests I used 1.8.5-p12
as a base.

Modification                        Mode 3  Mode 1
------------                        ------  ------
Base (for comparison)                  125
Link with static version instead       ~90              [1]
CPPFLAGS="-g -pg"                       19
Strip out -O2                            0     153      (!?!?)
Replace -O2 with -O                     18

[1] - Forgot to record actual number, but it was around 90 or so.

And last of all, I just wanted to confirm that when running in the default
configuration without "-pg", the sample code made 10000 iterations without
a crash. Took most of the day to run too. ;)

Other pertinent information from my previous post:

 > uname -a
Linux notimportant 2.6.18-1.2798.fc6 #1 SMP Mon Oct 16 14:54:20 EDT 2006 i686 athlon i386 GNU/Linux

 > g++ -v
Using built-in specs.
Target: i386-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man 
--infodir=/usr/share/info --enable-shared --enable-threads=posix 
--enable-checking=release --with-system-zlib --enable-__cxa_atexit 
--disable-libunwind-exceptions --enable-libgcj-multifile 
--enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk 
--disable-dssi --enable-plugin 
--with-java-home=/usr/lib/jvm/java-1.4.2-gcj-1.4.2.0/jre --with-cpu=generic 
--host=i386-redhat-linux
Thread model: posix
gcc version 4.1.1 20061011 (Red Hat 4.1.1-30)

Maybe it is a gcc 4.* thing? Is anyone else running a similar system (i.e.
FC6) able to give it a shot?

So there we go. I'm not sure what else I can provide at this point. I've
provided sample code that triggers the crash, tried it on multiple versions
of Ruby, isolated it down to the -pg switch, and so forth. I believe I've
adequately addressed any concerns about the sample code. This is a fairly
significant bug for anyone embedding Ruby in a project that needs profiling
information (assuming it affects other people as well), as it seems to
affect multiple versions of Ruby and makes the program unreliable and
crash-prone.

My offer to assist remains open; I can be reached by email through the
contact page on www.entropicsoftware.com. I do hope that the days of work
I've put into diagnosing this problem somehow pay off in locating the source
of this significant bug.

Now I feel it is time to take a break from Embedded Ruby and play around with
some alternate solutions. Working on this has certainly been an interesting
experience.

Please don't hesitate to ask if I can provide any additional information or
data from additional tests.

Take care,
Garth