Stephen,

I updated the MBARI7 patch at http://sites.google.com/site/brentsrubypatches again last night (on 1/11/09) before I'd read your post. (Sorry)

I had already concluded that -mpreferred-stack-boundary=2 is generally a "bad idea" and have removed it from the recommended options. It has portability problems and, even where it works, the net loss in speed is not worth the small reduction in stack usage for most Ruby scripts.

One option that does increase speed by about 7% across the board is -fomit-frame-pointer. It seems to work well with most recent gcc compilers, but segfaults on older ones, so I'm not recommending it by default. I believe that Microsoft C has an analogous option.

This latest update to MBARI7 adds a configuration option to select the method used to clear the stack from among four alternatives. The default is a (new) portable method that briefly allocates the "dirty" stack with alloca() before clearing it. This portable method costs time (~1.5%), but it is safer.

In practice, the 32-bit x86 is so starved for registers that I'd seen cases where gcc would emit a PUSH %ESP between the point in the (old, fast) stack clearing routine that read the stack pointer and the loop that was to zero the unallocated stack above the top. This would cause the stacked base pointer to be cleared as well and yield a segfault when it was later POP'ed from the stack. Fortunately, if this happens, the resulting Ruby binary fails immediately on the (bogus1.rb and bogus1.rb) test scripts included with the patches.

Ironically, -mpreferred-stack-boundary=2 makes the new, portable stack clearing method ineffective, because gcc insists that alloca(x>0) always return a 16-byte aligned pointer regardless of the configured preferred-stack-boundary. This might be considered a bug, but I'm honestly not sure.

I cannot seem to find a stack clearing method that is both safe and portable. Maybe others will succeed where I have punted.

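For anyone who wants to see the idea behind the portable method, here is a minimal sketch. It is not the code from the MBARI7 patch: the name wipe_unused_stack is hypothetical, and it assumes gcc or clang, a libc that provides <alloca.h>, and a stack that grows downward (as on x86). It claims the unused stack beyond the current frame with alloca() and zeroes it, so the compiler, rather than a hand-read stack pointer, decides where the wiped region starts.

```c
#include <alloca.h>   /* alloca(); assumed available (glibc, macOS) */
#include <stddef.h>

/*
 * Hypothetical illustration, not the MBARI7 routine.
 * Claims nbytes of currently unused stack with alloca() and zeroes it,
 * so stale ("ghost") pointers left there by earlier calls are gone.
 * Returns the number of bytes cleared.
 */
__attribute__((noinline))          /* gcc/clang: keep this in its own frame */
size_t wipe_unused_stack(size_t nbytes)
{
    volatile unsigned char *dirty; /* volatile: stores must not be elided */
    size_t i;

    if (nbytes == 0)
        return 0;                  /* nothing to allocate or clear */

    dirty = alloca(nbytes);        /* region just past the live frame */
    for (i = 0; i < nbytes; i++)
        dirty[i] = 0;

    return nbytes;                 /* region is released on return */
}
```

A caller would invoke something like wipe_unused_stack(4096) at a point where the stack is shallow. Because the region comes from alloca(), any alignment padding gcc adds to the returned pointer is left unwiped, which is how the 16-byte alignment mentioned above can leave a few bytes uncleared.
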
For now, my tests indicate that, on 32-bit x86 with gcc 4.3, the combination of

  CFLAGS="-O2 -fomit-frame-pointer -fno-stack-protector"

and

  #define STACK_WIPE_SITES 0x4370  /* in rubysig.h */

works best. It protects against ghost references well and runs even micro-benchmarks slightly faster than unpatched 1.8.7-p72.

- brent


Stephen Sykes-3 wrote:
>
> Brent,
>
> A report from the field...
>
> We have been using your patches in a production Rails environment
> since you released them, and this is on x86_64-linux.
>
> We notice no problems, ruby works well and is significantly faster.
>
> And to keep up to date, we just applied patch MBARI7 (from
> http://sites.google.com/site/brentsrubypatches/ ) with the default
> configuration. FWIW we see a further small performance improvement,
> something like 5% on a rough measurement.
>
> Just a note on your build instructions: the
> -mpreferred-stack-boundary=2 flag causes configure to fail on OSX,
> complaining that it can't find the size of int (the program to do so
> segfaults). And that setting is not accepted by gcc on x86_64 because
> it needs the boundary to be 4 or more. In both cases I removed the
> option and all works fine.
>
> Regards,
> Stephen
>

--
View this message in context: http://www.nabble.com/-ruby-core%3A19846---Bug--744--memory-leak-in-callcc--tp20447794p21422034.html