Some ideas based on OobGC in unicorn.  I hate tuning knobs, so I would
like improvements to be transparent and need no user interaction.

Current OobGC:

1) works because GC happens when the stack is shallow

2) works because GC happens when we would otherwise be sleeping,
   not during allocation

1) is more important than 2): the more a GC can reap, the less it
runs.

The Ruby VM should be able to detect these conditions fairly easily
in single-threaded mode.  However, this works poorly with multiple
threads.

Thread-friendly OobGC detection might use per-thread GC counters.
This way, threads allocating more objects run GC more.
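
A rough sketch of the per-thread counter idea in plain Ruby, only to
make the bookkeeping concrete.  Everything here is an assumption: the
class name, the :oob_allocs thread variable and the threshold are made
up, and a real version would live inside the VM and count allocations
itself instead of being told about them.  Note that GC.start still
collects the whole heap; only the decision to run is per-thread.

```ruby
# Hypothetical sketch: per-thread allocation counters deciding when the
# calling thread is "due" for an out-of-band GC.  Not an existing Ruby
# VM or unicorn API.
class ThreadGCCounter
  def initialize(threshold)
    @threshold = threshold # allocations per thread before GC is due
  end

  # Allocations recorded so far for the calling thread.
  def count
    Thread.current.thread_variable_get(:oob_allocs) || 0
  end

  # Record +n+ allocations against the calling thread only.
  def note_alloc(n = 1)
    Thread.current.thread_variable_set(:oob_allocs, count + n)
  end

  # True when the calling thread has allocated enough to justify a GC.
  def due?
    count >= @threshold
  end

  # Run GC if this thread is due, then reset its counter.
  # Returns true if GC ran, false otherwise.
  def maybe_gc
    return false unless due?
    GC.start
    Thread.current.thread_variable_set(:oob_allocs, 0)
    true
  end
end
```

A thread would call maybe_gc at a point where its stack is shallow
(for example, between requests), so heavy allocators trigger GC more
often than mostly idle threads.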

Using online mean/stddev to track stack depth may be expensive.
A more effective GC will (hopefully) recover the costs of calculation.
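
For reference, the online mean/stddev can be kept with Welford's
algorithm, which costs a few floating-point operations per sample and
stores no history.  This is a sketch under assumptions: the class
name, the shallow? helper and its "k stddevs below the mean" rule are
hypothetical, not something the VM does today.

```ruby
# Welford's online algorithm: running mean and standard deviation of
# observed stack depths without storing the samples.
# Hypothetical names; the shallow? cutoff is an assumption.
class StackDepthStats
  attr_reader :n, :mean

  def initialize
    @n = 0
    @mean = 0.0
    @m2 = 0.0 # running sum of squared deviations from the mean
  end

  # Fold one stack depth sample into the running statistics.
  def add(depth)
    @n += 1
    delta = depth - @mean
    @mean += delta / @n
    @m2 += delta * (depth - @mean)
    self
  end

  # Population standard deviation of the samples seen so far.
  def stddev
    @n.zero? ? 0.0 : Math.sqrt(@m2 / @n)
  end

  # Treat +depth+ as shallow when it is at least +k+ standard
  # deviations below the running mean.
  def shallow?(depth, k = 1.0)
    @n > 1 && depth <= @mean - k * stddev
  end
end

stats = StackDepthStats.new
[2, 4, 4, 4, 5, 5, 7, 9].each { |d| stats.add(d) }
```

From Ruby code, caller.size is one way to sample the depth, though
building the backtrace is itself costly; inside the VM the depth
should be much cheaper to read.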

1) being more important than 2) should still hold with multiple
threads: more allocations -> more freeable objects, and short-lived
objects are probably rarely shared across threads.

I doubt I'll have time/knowledge/need to implement this for a
while.  If others have free time, please try :)  Thank you.

[1] - http://yahns.yhbt.net/README