On Tue, Jan 12, 2010 at 5:23 AM, Brent Roman <brent / mbari.org> wrote: > > This paper and excellent slide set seem very promising: > > http://matsu-www.is.titech.ac.jp/~endo/papers/endo-ismm2002-gc.pdf > http://matsu-www.is.titech.ac.jp/~endo/papers/endo-ismm2002-gc.ppt > > Toshio Endo and Kenjiro Taura adapted the Boehm conservative GC > to *dramatically* reduce pauses with incremental collection. ¨Âèåéò Çà áääó > about 10% to execution times on a single core CPU, but improves > execution times over basic mark-and-sweep > on multi-core CPU -- by allowing marking to run > in parallel on multiple cores. > > This GC seems like a good fit to me. ¨Âéîãå íïóô æïìëòõîîéîÒõâù áìòåáä> use multi-core processors, it should be an all around performance "win". > > Has anyone looked into grafting this GC into MRI? > > Does anyone see any conceptual problems in doing so? Well, I've just scanned the paper, but I have a few comments. 1) The algorithm they present relies on hooking into the virtual memory implementation in order to implement a write barrier by write protecting pages, catching the exception caused when a store is done into a protected page, marking the page as dirty, and then un-write protecting the page. I'm not sure how portable this is across various combinations of operating system and processor. 2) IMHO, conservative GCs are compromises which attempt to provide GC to languages without a strong 'object model'. And without help from the compiler. By object model here, I mean more than just how objects are laid out, or how methods are found, but a design which is factored so that at the least references to objects vs. non-objects can be distinguished, and reference changes can be efficiently tracked, usually with some support from the compiler, and/or implementation of the 'byte-codes'. There's no reason why Ruby couldn't be implemented with such an object model, in fact I don't think that it's far from that. There are decades of experience in implementing VMs for such languages which provide efficient accurate garbage collection, while also providing at least, and usually two, interfaces for invoking code not generated by the languages compiler: 1) An interface for invoking procedural library code which is totally unaware which language invoked it. Such code takes a list of parameters and returns result value(s). This code need have no interaction with the GC, the interface insures that object references don't pass across the interface, all the parameters get converted to and result values get converted from basic data types (ints, strings, structs ...). This is what the FFI does. 2) An interface for using a low-level language for writing 'primitives' which are aware that they need to interact with the VM and the garbage collector. This requires that the interface provides macros and/or function calls/call backs and a set of rules to be followed in writing the primitives. The biggest problem with MRI, is that it started out using one api for both of these. I suspect that a large percentage of Ruby extensions are really cases of use case 1. If Ruby were to take a path of deprecating using the extension api for use case 1 in favor of FFI, and then evolving the extension API to support having objects move during GC (which the similar APIs for Smalltalk, Java and VM base languages have done), it could actually get a competitive VM/GC. -- Rick DeNatale Blog: http://talklikeaduck.denhaven2.com/ Twitter: http://twitter.com/RickDeNatale WWR: http://www.workingwithrails.com/person/9021-rick-denatale LinkedIn: http://www.linkedin.com/in/rickdenatale