Hi Robert,

Thank you for the detailed reply. :)

On 30/11/11 00:07, Robert Klemme wrote:
> On Tue, Nov 29, 2011 at 1:43 PM, Garthy D
> <garthy_lmkltybr / entropicsoftware.com>  wrote:
>> I was wondering if running the (mark and sweep?) garbage collector manually
>> is supposed to collect (and call define_finalizer procs on) objects with no
>> remaining references. I would expect that the answer is yes, but it doesn't
>> seem to actually work that way.
>
> There are no guarantees whatsoever that *any* GC run will collect
> particular unreachable instances.  The only guarantee is that all
> finalizers are invoked eventually (unless of course in case of
> catastrophic crash) - even if it is at process termination time.

I suspected this might be the case, and it's certainly not an 
unreasonable assumption to make of a GC.
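
For reference, a minimal sketch of the behaviour under discussion. The Resource class and its FINALIZED list are hypothetical stand-ins, not code from my program. It registers a finalizer on ten otherwise unreferenced objects, forces a GC run, and reports how many finalizers have fired. Per the point above, any count from 0 to 10 is legitimate after GC.start; the rest run at process exit at the latest.

```ruby
# Hypothetical stand-in for an object that owns an external resource.
class Resource
  FINALIZED = [] # ids of resources whose finalizer has already run

  def initialize(id)
    # Build the proc in a class method so it does not capture the instance;
    # a finalizer that references its own object keeps that object alive.
    ObjectSpace.define_finalizer(self, self.class.finalizer_for(id))
  end

  def self.finalizer_for(id)
    proc { FINALIZED << id }
  end
end

# Create the objects inside a method so no local variable keeps them alive.
def make_garbage(count)
  count.times { |i| Resource.new(i) }
  nil
end

make_garbage(10)
GC.start

# No particular count is guaranteed at this point.
puts "finalized after GC.start: #{Resource::FINALIZED.size} of 10"
```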

>> If you wrap the A.new in a loop instead, eventually it'll free up the other
>> instances of A en-masse (ie. calling the GC doesn't seem to do it, but
>> flooding memory until the GC triggers does).
>>
>> I have seen similar behavior in a large embedded Ruby program that I am
>> working on. In this case the Ruby objects in question have instance
>> variables that reference textures, and I really need the finalizer to be
>> called when all of the references to the textures are lost, so as to free up
>> the texture memory. At the moment they are being finalised at program exit,
>> when the available texture memory has long run out. This isn't good, and it
>> means I need to rewrite every potential bit of this code to use manual
>> reference counting.
>
> No, that's a bad solution.  What you observe might mean that you
> simply haven't created enough Ruby garbage for GC to think it needs to
> work.  I have no idea how you allocate texture memory but if it is in
> a C extension written by you I would check whether there is a way to
> go through MRI's allocation in order to correct MRI's idea of used
> memory.  Implementation of Ruby's String might be a good example for
> that.

Your assumption re the C extension for texture memory is pretty much 
spot on. :)

Re texture memory, it's a little more complicated than a standard 
allocation. Some memory will be the standard allocated sort, but some 
will be on the video card, and they're effectively coming from different 
"pools" (or heaps), neither of which I'll have direct control over. 
Unless the Ruby GC directly understands this concept (I don't know if it 
does, but I'm guessing not), I'm not going to be able to use Ruby to 
manage that memory. Unfortunately, the problem goes a bit beyond just 
textures, as I'm wrapping a good chunk of a 3D engine in Ruby objects. 
That's my problem to worry about though.

And that's assuming I can redirect the allocation calls anyway. I'm not 
sure if I can. It'd be nice to be able to inform the Ruby GC of 
allocated memory (or an estimate, if it keeps changing) without actually 
leaving the GC to allocate it. Please correct me if I'm wrong, but I'm 
assuming this can't be done, and you must use the ALLOC/ALLOC_N-style 
functions?

> Another solution is to use transactions like File.open with a block
> does.  Then you know exactly when the texture is not used any more and
> can immediately release it (in "ensure").  Whether that is a viable
> option depends on the design of your application.  See here:
>
> http://blog.rubybestpractices.com/posts/rklemme/002_Writing_Block_Methods.html

Thank you for the link. It does not appear to be suitable for, say, 
textures in my specific case (they are long-lived and freed at a later 
time), but may help with some of the other problems I need to solve.
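
As I understand the suggestion, the pattern looks roughly like the sketch below. The Texture class, its open method, and release are hypothetical names for illustration, and the real engine calls are left out. With a block, the texture is released in "ensure" as soon as the block ends. Without a block, the caller gets the object back and has to call release explicitly, which is closer to what long-lived textures would need.

```ruby
# Hypothetical texture wrapper; the real allocation and free calls into the
# engine would go where the comments indicate.
class Texture
  attr_reader :name

  # With a block: yield the texture and release it when the block exits,
  # even if the block raises. Without a block: return the texture and
  # leave the release to the caller.
  def self.open(name)
    texture = new(name)
    return texture unless block_given?

    begin
      yield texture
    ensure
      texture.release
    end
  end

  def initialize(name)
    @name = name
    @released = false
    # (allocate texture memory here)
  end

  # Safe to call more than once.
  def release
    return if @released

    @released = true
    # (free texture memory here)
  end

  def released?
    @released
  end
end

# Short-lived use: released deterministically at the end of the block.
scoped = nil
Texture.open("grass") { |t| scoped = t }

# Long-lived use: released manually at a later time.
long_lived = Texture.open("sky")
```

Neither path depends on the GC or on finalizers to free the texture.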

>> So basically: If the garbage collector is called, are objects with no
>> remaining references supposed to be reaped during the call, and their
>> defined finalizers called? Whatever the answer- why is that? Is there an
>> official word on how this is supposed to work, and what can (and can't) be
>> relied upon?
>
> GC's prefer to decide themselves when and how they collect deadwood.
> There are usually only very few guarantees (see JVM spec for an
> example) in order to allow VM implementors maximum freedom and room
> for optimization.  The only hard guarantee is that an object won't be
> collected as long as it is strongly reachable.

This is completely reasonable of course.

Unfortunately it causes a lot of problems in my case, as I had assumed 
one thing from general reading on the topic, and observed another. 
That's my problem to deal with though, not anyone else's. Still, there 
seems to be a lot of information online that talks about Ruby performing 
mark and sweep, either stating outright that unreferenced objects are 
freed on first GC, or heavily implying it at least. From your 
description, and my observations, this information appears to be 
incorrect. Basically, there seems to be a lot of misinformation about 
what the GC is doing. Apart from the source, is there some place where 
the correct behaviour is discussed that could be referred to instead, 
particularly when someone suggests that the MRI GC implementation 
should immediately clean up unreferenced objects on the next GC (which, 
as we've established, isn't accurate)?

Garth