Issue #8339 has been updated by headius (Charles Nutter).


I like the technique. I have some observations.

* Most of the benchmarks do not have enough old data to make a difference.

Small benchmarks, in particular, will not show anything useful. These benchmarks could be made more interesting by creating a large amount of old data first, and then running the benchmark a few times. For a benchmark that's all young data, I can't imagine there's going to be any gain (and probably a loss instead).
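As a rough sketch of what I mean (the helper names, object counts, and workload here are made up for illustration, not taken from any existing benchmark suite): build a big set of long-lived objects first, keep it reachable, and then time a young-allocation workload several times while counting GC runs.

```ruby
# Hypothetical helper: allocate many long-lived objects and force a GC so a
# generational collector has a chance to treat the survivors as old data.
def build_old_data(count)
  old = Array.new(count) { |i| "old object #{i}" }
  GC.start
  old
end

# Hypothetical workload: allocates only short-lived (young) objects.
def young_workload(iterations)
  sum = 0
  iterations.times { |i| sum += [i, i + 1].map { |x| x * 2 }.first }
  sum
end

# Time a block with the monotonic clock; returns seconds as a Float.
def time_it
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

old_data = build_old_data(200_000)

3.times do |run|
  gc_before = GC.count
  elapsed = time_it { young_workload(100_000) }
  puts format("run %d: %.4fs, %d GC runs", run + 1, elapsed, GC.count - gc_before)
end

# Reference old_data after the runs so it stays reachable throughout.
puts "old objects kept alive: #{old_data.size}"
```

Comparing the numbers with and without the `build_old_data` step (and with and without the generational collector) should show whether the old data is what makes marking expensive.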

* This seems like a nice segue toward more explicit reference management, as in JNI.

A "sunny" reference is rather like a downcall-local reference in JNI, where you guarantee (or else are forced) to hold a reference only for the duration of the downcall. I could see adding APIs that would say "give me this reference, but only consider it 'shady' until the downcall is done". Somewhat like explicit GIL release, but an explicit "I won't use this reference outside this call". I'd really like to see that enter the C API, since it's a key reason why JNI libraries work well without limiting JVM GC implementation. A "global" reference in JNI would be like explicitly saying "I want a shady reference". This is, honestly, the way the Ruby C API needs to go to truly enable smart GC. What you have done with RGenGC is a halfway step, marking specific C API operations as grabbing "global" or "shady" references. We can make this more explicit, and it would be a good thing to do so.

For example, if I want to get access to an array's internals, I could say "I want a local reference to array internals". For the duration of that downcall, the array would be pinned (the JVM does not guarantee pinning, but does guarantee that the reference is good for the downcall's lifetime). It would open up the possibility for C ext authors to use "shady" APIs in a limited scope.
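To make the scoping idea concrete, here is a toy model in plain Ruby. It is only an illustration of the contract: `LocalRef` and `with_local_ref` are hypothetical names, not part of Ruby, its C API, or JNI, and nothing here actually pins memory.

```ruby
# Toy model only: a handle that is valid just for the duration of one "downcall".
class LocalRef
  def initialize(target)
    @target = target
    @valid = true
  end

  # Access the underlying object; only legal while the scope is active.
  def get
    raise "local reference used outside its downcall" unless @valid
    @target
  end

  # Invalidate the handle so the VM would be free to move or collect the target.
  def release!
    @valid = false
    @target = nil
  end
end

# Hypothetical scoped API: the reference is released when the block exits,
# even if the block raises.
def with_local_ref(object)
  ref = LocalRef.new(object)
  yield ref
ensure
  ref.release! if ref
end

total = with_local_ref([1, 2, 3]) { |ref| ref.get.inject(:+) }
puts total

# A handle that escapes the block can no longer be dereferenced.
leaked = nil
with_local_ref([4, 5]) { |ref| leaked = ref }
begin
  leaked.get
rescue RuntimeError => e
  puts "rejected: #{e.message}"
end
```

The point of the sketch is that the VM only has to treat the object as "shady" between the start and end of the block; a "global" reference would be the same handle without the automatic release.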

...

Nice work either way. What you have done here may be applicable to moving-GC implementations that want to implement the Ruby C API, in that it would provide a way for us to map implicit C API operations to explicit VM operations without a lot of nasty weak references and such.
----------------------------------------
Feature #8339: Introducing Generational Garbage Collection for CRuby/MRI
https://bugs.ruby-lang.org/issues/8339#change-39031

Author: ko1 (Koichi Sasada)
Status: Open
Priority: Normal
Assignee: ko1 (Koichi Sasada)
Category: core
Target version: current: 2.1.0


|  One day a Rubyist came to Koichi and said, "I understand how to improve 
|  CRuby's performance. We must use a generational garbage collector." Koichi
|  patiently told the Rubyist the following story: "One day a Rubyist came 
|  to Koichi and said, 'I understand how to improve CRuby's performance..."
|  [This story is an homage to the introduction of the paper:
|   "A real-time garbage collector based on the lifetimes of objects"
|   (by Henry Lieberman, Carl Hewitt)
|   <http://dl.acm.org/citation.cfm?id=358147&CFID=321285546&CFTOKEN=10963356>]

We, the Heroku Matz team, have developed a new generational mark & sweep
garbage collection algorithm, RGenGC, for CRuby/MRI.
(Strictly speaking, it is a generational marking algorithm.)

The good points are:

  * Reduce marking time (yay!)
  * My algorithm doesn't introduce any incompatibility into normal C-exts.
  * Easy to develop

Please see the attached PDF file for more details.
Code is: https://github.com/ko1/ruby/tree/rgengc

How about introducing this new GC algorithm/implementation into Ruby 2.1.0?

Thanks,
Koichi



-- 
http://bugs.ruby-lang.org/