On Mon, Jan 11, 2010 at 9:42 AM, Erik Scheirer <e / illume.org> wrote: > I think a pluggable/loadable GC scheme, as long as its really simple to use, is perfect. So would be a lasting world peace! > There would be some overhead created by making it pluggable, though, but in the scheme of things it would be well worth the small amount of cpu cycles lost. I just can't see the feasibility of this. Any good GC involves careful interaction between the parts of the system which mutate memory and those which manage it. Getting a highly performant GC almost always involves careful coordinated design of things like: The low-level layout of objects. The division of memory into various collections of objects (e.g. in a GC scheme the old objects and the new object live in different spaces, sometimes the new space moves each time a minor GC happens. Efficient detection of whether an object is old or new. For a GC requiring a 'write-barrier', efficient implementation of that write barrier. ... And to really get the most out of a GC, some of the low level decisions can be platform and processor specific. There are cascading design decisions to be be made. For example, lets say we're making a generation scavenging GC. We need to capture enough information as the mutator (the program) runs so that we can find any new objects which are referenced by an old object. This is the reason for the write barrier. So there are several issues: How do we detect a store of a reference to a new object into an old object with the lowest overhead. How do we remember a store into an old object with the lowest overhead. ... There are several strategies for detecting old vs new objects, each with it's own tradeoffs, for example: A flag bit in the object header Address range checking to see which space it's in, or not in. On some platforms and processors, one might make use of the virtual memory hardware and access privileges to detect such stores, but this is highly non-portable and may or may not outperform other approaches. Flag bits need to be maintained properly, and are expensive, see below. Address range checking is more common, and goes back to the interactions with the overall design of the "VM". And what about how to remember the old objects which need to be considered during a new object GC. We could perhaps make a linked or set of "remembered" objects, but this is expensive both in terms of space and speed. Most GCs use some form of "card marking" where old space is broken up in to 'cards' containing a range of memory. Cards are similar to pages in a virtual memory system, and may or may not be the same in a particular GC implementation. In such a scheme when a new object reference is stored in an old object, the fact that has happened is stored as a change to the card in which the old object resides. The most obvious way to do this is to have a data structure somewhere which has a bit for each card. But on most processors setting individual bits is expensive involving fetching, masking, and re-storing a larger datatype. The Self guys recognized this and found that for the processors they were working on using a byte rather than a bit for the mark, was much better overall despite requiring eight times the space for the marks. http://www.cs.ucsb.edu/~urs/oocsb/papers/write-barrier.pdf And these are just some of the variations once one has chosen a particular GC algorithm or perhaps one of a family of GC algorithms. Now I know that LLVM attempts to do something like this, http://llvm.org/docs/GarbageCollection.html but it apparently hasn't been all that successful: http://lhc-compiler.blogspot.com/2009/01/why-llvm-probably-wont-replace-c.html The problem is that LLVM defines the interface between the mutator and the GC "framework" in terms of C functions and function callbacks, e.g. for the write-barrier, whereas a really efficient GC implements the write barrier(and other GC bookkeeping tasks) in a few machine instructions. I fear that a pluggable GC would only let you play around with pretty poorly performing GC alternatives. -- Rick DeNatale Blog: http://talklikeaduck.denhaven2.com/ Twitter: http://twitter.com/RickDeNatale WWR: http://www.workingwithrails.com/person/9021-rick-denatale LinkedIn: http://www.linkedin.com/in/rickdenatale