(2013/03/19 15:45), tmm1 (Aman Gupta) wrote:
> I agree this approach provides more flexibility. But GC hooks cannot allocate ruby objects or interact with GC, so it is tricky to use.

Yes, exactly. This is why we need to be more careful, and why I
restrict the hooks to C functions only ([ruby-core:53530]).

However, with only such restricted hooks it is difficult to build
anything useful.

So the new idea (the core idea of this proposal) is to introduce another
new API: one that registers tasks to be invoked at finalizing timing.

Finalizing timing is:
* the nearest point to the GC
* free to execute Ruby code (same environment as finalizers)

Summary of my proposal:
* Introduce new GC-related hooks (restricted to C functions)
  * Mark hook
  * Free hook
  * GCed hook
* Introduce a new API to register a task to be invoked at finalizing timing

In particular, the Free hook and the GCed hook run inside the GC
procedure. In these C hooks, collect information (current location, etc.)
into some storage. If you want to manipulate it at the Ruby level, pass
this information to the task registration API.
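
As a rough illustration of this flow (not the actual patch), the sketch
below simulates it in plain C. Every name in it (free_hook, gced_hook,
run_finalizing_tasks, EVENT_CAPACITY, and so on) is hypothetical, and the
fixed-size storage with a dropped counter is only one possible policy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* All names below are hypothetical; this simulates the flow in plain C
 * and is not the proposed Ruby API. */

#define EVENT_CAPACITY 4  /* assumed fixed size for the storage */

typedef struct {
    size_t object_id;  /* stand-in for the freed object */
    const char *file;  /* location seen at hook time */
    int line;
} free_event;

typedef void (*deferred_task)(const free_event *ev, void *data);

static free_event events[EVENT_CAPACITY];
static size_t event_count = 0;
static size_t dropped_count = 0;
static deferred_task pending_task = NULL;
static void *pending_data = NULL;

/* Free hook: runs inside GC, so it only writes to preallocated storage. */
static void free_hook(size_t object_id, const char *file, int line)
{
    if (event_count == EVENT_CAPACITY) {
        dropped_count++;  /* one possible policy: count and drop */
        return;
    }
    events[event_count].object_id = object_id;
    events[event_count].file = file;
    events[event_count].line = line;
    event_count++;
}

/* GCed hook: still inside GC; it only registers the task for later. */
static void gced_hook(deferred_task task, void *data)
{
    if (event_count > 0) {
        pending_task = task;
        pending_data = data;
    }
}

/* Finalizing timing: outside GC, where arbitrary work is allowed. */
static size_t run_finalizing_tasks(void)
{
    size_t i, n = event_count;
    assert(n <= EVENT_CAPACITY);
    if (pending_task == NULL) return 0;
    for (i = 0; i < n; i++) pending_task(&events[i], pending_data);
    event_count = 0;
    pending_task = NULL;
    pending_data = NULL;
    return n;
}

/* Example task: report each event and count it. */
static void report_event(const free_event *ev, void *data)
{
    size_t *seen = data;
    printf("freed object %zu at %s:%d\n", ev->object_id, ev->file, ev->line);
    (*seen)++;
}

/* Simulate one GC that frees six objects, then the finalizing step. */
static size_t demo(void)
{
    size_t id, seen = 0;
    for (id = 1; id <= 6; id++) free_hook(id, "example.rb", (int)(10 + id));
    gced_hook(report_event, &seen);
    run_finalizing_tasks();
    return seen;
}
```

Calling demo() handles four events and drops two, because the storage
holds only four entries. The location is recorded at hook time, so it is
still correct when the task runs later.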

-----

At first, I wanted to provide only GC-related events invoked at
finalizing timing. However, this approach has several problems:
(1) It can't collect the correct location (filename, line).
    If GC happens inside nested C methods, finalizers are invoked
    only after those C methods return.
(2) It is difficult to determine how many freed objects can be
    registered for deferred processing (the storage I mentioned above).

My proposal will solve them.


> Also implementation of newobj hook is tricky, because object klass/flags are set in the OBJSETUP macro.

Now we have the rb_newobj_of() function.

> An object tracing api will provide a lot of benefits (debuggers can track full C/ruby stacktrace of allocation site), but there are still some advantages to doing this in the VM directly:
> 
>   - gc.c can do much better job of storing object metadata efficiently (external statistics library will have to use hash table)

Yes. We need to make a comparison.
I think there is no big difference between the VM level and the C-ext level.
Maybe it is too slow to use in production, but I have no data to compare.

>   - if statistics library is loaded as cext gem, it cannot track objects already created (such as objects inside rubygems library)

I believe this is not a problem, because it can be solved by requiring
the library first.

> I would like to hear your idea, but I can wait for patch. Or if you tell me I can try to implement.

Ideas are above.

>> This movie shows the status of heaps. black pixel is free object. red
>>  pixel is string object, and so on.
> 
> This is very cool. Such visualizations make it much easier to understand GC behavior, so I am excited to see an official API to make allocation tooling easier.

Hehe. It was my hobby :)
It is easy to do with the trace API (GCed hook) and rb_objspace_each_objects().

-- 
// SASADA Koichi at atdot dot net