Thanks for confirming that the "Ruby gc heuristic" is, in practice,
just a simple invocation of gc once GC_MALLOC_LIMIT bytes have been
ALLOCated.

I was thinking about this problem last night, especially the good
(counter) example given by Reimer.  Because of my "Ruby-C separation"
philosophy (let Ruby and C manage their own memory), which I follow for
execution performance reasons, I came up with this solution.  In my
malloc wrapper function, I will also keep a simple count of the memory
currently allocated on the C side:

    void *safe_malloc (size_t size)
    {
        total += size;
        if (total > threshold)
        {
            rb_gc ();
            threshold += GC_MALLOC_LIMIT;
        }
        if ((ptr = malloc (size)) != NULL)
            ....
        else
            .....
    }

    void safe_free (void *ptr)
    {
        total -= .....
        free (....)
    }

I think with this malloc wrapper function, the total garbage can be
limited to 2 * GC_MALLOC_LIMIT = 16 Mbytes.  Also, when the C memory
fluctuates, say, between 7 and 9 Mbytes, the Ruby gc will not be called
unnecessarily.  This solution will solve the problem, won't it?  Or are
there other counter-examples that would defeat it?
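
To make the idea concrete, here is a minimal sketch of the wrapper with
the elided parts filled in.  Several things in it are my assumptions and
not settled design: run_gc() and gc_calls are stand-ins for rb_gc() so
that the sketch compiles without ruby.h, header_t is a hypothetical size
header that lets safe_free() know how much to subtract, threshold is
assumed to start at GC_MALLOC_LIMIT, and a failed malloc() simply undoes
the accounting and returns NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define GC_MALLOC_LIMIT (8 * 1024 * 1024)   /* 8 Mbytes */

/* Stand-in for rb_gc() (assumption): a real extension would call
   rb_gc() here.  The counter only exists to observe the behaviour. */
static unsigned long gc_calls = 0;
static void run_gc(void) { gc_calls++; }

static size_t total = 0;                    /* bytes currently held on the C side */
static size_t threshold = GC_MALLOC_LIMIT;  /* assumed initial value */

/* Hypothetical header stored in front of every block.  It records the
   requested size; the union keeps the payload suitably aligned. */
typedef union {
    size_t size;
    max_align_t align;
} header_t;

void *safe_malloc(size_t size)
{
    header_t *h;

    total += size;
    if (total > threshold) {
        run_gc();
        threshold += GC_MALLOC_LIMIT;
    }
    h = malloc(sizeof(header_t) + size);
    if (h == NULL) {
        total -= size;      /* allocation failed: undo the accounting */
        return NULL;
    }
    h->size = size;
    return h + 1;           /* payload starts right after the header */
}

void safe_free(void *ptr)
{
    header_t *h;

    if (ptr == NULL)
        return;
    h = (header_t *)ptr - 1;
    assert(total >= h->size);   /* accounting must never underflow */
    total -= h->size;
    free(h);
}
```

With these definitions, allocating 7 Mbytes and then repeatedly
allocating and freeing a further 2 Mbytes calls the collector only once,
on the first crossing of 8 Mbytes; after that, threshold is 16 Mbytes.
Note that in this sketch threshold is only ever raised, never lowered
when total drops.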

My particular concern with rb_gc() is that its running time is linear
(O(N)) in the number of Ruby objects.  Therefore my philosophy is not to
invoke rb_gc() unnecessarily.

The idea of a dynamic GC_MALLOC_LIMIT is very interesting.  I don't know
whether there has been any research on such algorithms for mark-and-sweep
gc.  For now, I think using a reasonable constant GC_MALLOC_LIMIT value is
adequate.

Regards,

Bill
============================================================================
Mauricio Fernández <batsman.geo / yahoo.com> wrote:
>> Reimer Behrends <behrends / cse.msu.edu> wrote:
>> =================================================================
>> > A problem occurs if, say, you have an Image class that allocates memory
>> > for the pixel data using malloc():
>> 
>> > for i in 1..1000 do
>> >   img = Image.load(directory + i.to_s + ".png")
>> > end
>> 
>> > Assuming that each image occupies, say, 1 MB in memory, the program
>> > allocates roughly 1 GB of memory. Yet as far as Ruby knows, only a few
>> > kilobytes have been requested, so no garbage collection is initiated
>> > and the program starts chewing through all the available swap space,
>> > even though there is plenty of garbage to collect.
>> =================================================================

> This means the garbage collector will automatically be called after
> GC_MALLOC_LIMIT (8MB on sane platforms) bytes have been ALLOCated
> (malloc_memories is reset afterwards).
> But if the data you ALLOCated points to malloc()'ed data there will
> indeed be a lot of garbage, and Ruby doesn't know it.