Robert Klemme wrote:
> IMHO ObjectSpace should not be implemented in Java land.  Why?  The JVM 
> has to keep track of instances anyway and implementing this in Java via 
> WeakReferences seems to duplicate functionality that is already there. 
> Did you consider using "Java Virtual Machine Tools Interface"?
> 
> http://java.sun.com/javase/6/webnotes/trouble/TSG-VM/html/gbmmt.html#gbmls
> 
> You could either follow the same approach of the heapTracker presented 
> on that page and use a flag or require a lib that enables ObjectSpace 
> (because of the overhead of instrumentation).

You just hit on exactly why we don't use JVMTI for ObjectSpace. It would 
certainly work, but it would add a lot of overhead we'd never expect 
people to accept in a real application. Plus, it would track far more 
object instances than we actually want tracked. We'd love to include a 
JVMTI-based ObjectSpace implementation, however...it just hasn't been a 
high priority to implement since 99% of users never actually need 
ObjectSpace.

> Alternatively there may be another method that does not need 
> instrumentation and that can give you access to every (reachable) object 
> in the JVM.

If there is...we haven't found it. The "linked weakref list" has been 
the least overhead so far, and it's still a lot of overhead.
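
For anyone who hasn't seen the weakref approach, here is a minimal sketch of the general idea. It is not JRuby's actual code: the names WeakRegistry, register and eachObject are made up for illustration, and it uses a plain ArrayList instead of a linked list. It does show the per-object WeakReference allocation and the cleared-reference check.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration only; not JRuby's ObjectSpace implementation.
class WeakRegistry<T> {
    private final List<WeakReference<T>> refs = new ArrayList<>();

    // Every tracked object costs one extra WeakReference allocation.
    public synchronized void register(T obj) {
        refs.add(new WeakReference<>(obj));
    }

    // Visits each tracked object that is still alive and returns how many
    // were visited. Entries whose referent has been collected are pruned.
    public int eachObject(Consumer<? super T> visitor) {
        List<T> live = new ArrayList<>();
        synchronized (this) {
            Iterator<WeakReference<T>> it = refs.iterator();
            while (it.hasNext()) {
                T obj = it.next().get(); // null once the GC cleared the referent
                if (obj == null) {
                    it.remove();
                } else {
                    live.add(obj); // strong reference held for this traversal
                }
            }
        }
        // Run the visitor outside the lock so it may register new objects.
        for (T obj : live) {
            visitor.accept(obj);
        }
        return live.size();
    }
}
```

Which objects show up in a given traversal depends on when the collector last ran, so the result is only predictable for objects the caller still holds strongly.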

>> Your idea has come up in the past, and it would probably eliminate the 
>> cost of an ObjectSpace list. However that doesn't appear to be where 
>> we pay the highest cost.
>>
>> The two items that (we believe) cost the most for us on the JVM are:
>>
>> - Constructing an extra object for every Ruby object...namely, the 
>> WeakReference object to point to it. So we pay a 
>> memory/allocation/initialization cost.
>> - WeakReference itself causes Java's GC to have to do additional 
>> checks, so it can notify the WeakReference that the object it points 
>> at has gone away. So that slows down the legendary HotSpot GC and we 
>> pay again.
>>
>> I believe the parent -> weakref -> children algorithm is used in some 
>> implementations of ObjectSpace-like behavior, so it's perfectly valid. 
>> But again, there are certain aspects of ObjectSpace that are just 
>> problematic...
>>
>> - threading or concurrency of any kind? No, you can't have 
>> multithreading with ObjectSpace, nor a concurrent/parallel GC (and it 
>> potentially excludes other advanced GC designs too).
>> - determinism? Matz told me that "ObjectSpace doesn't have to be 
>> deterministic"...but when it starts getting wired into libraries like 
>> test/unit, it seems like people expect it to be. If we can say ObjectSpace 
>> isn't deterministic, then *nobody* should be relying on its contents 
>> for core libraries, and we could reasonably claim that each_object 
>> will never return *anything*.
> 
> I'd reformulate the requirement here: ObjectSpace.each_object must yield 
> every object that was existent before the invocation and that is 
> strongly reachable.  I believe for the typical use case (e.g. traversing 
> all class instances) this is enough while leaving enough flexibility for 
> the implementation (i.e. create a snapshot of some form, iterate through 
> some internal structure that may change due to new objects being created 
> during #each_object etc.).

The problem here is "strongly reachable". During ObjectSpace processing, 
the last strong reference to an object may go away and the garbage 
collector may run. Should ObjectSpace prevent GC from running once it has 
traversed that object and now references it? If not, how should it be 
handled if immediately before you return an object from each_object, it 
gets garbage collected? There's no way to catch that, so each_object may 
end up returning a reference to an object that's gone away, or 
reconstituting an object whose finalization has already fired. Bad 
things happen.

ObjectSpace is just not compatible with any GC that requires the ability 
to move objects around in memory, run in parallel, and so on. It can 
*never* be deterministic unless it can "stop the world", so it should 
not be used for algorithms that require any level of determinism, such 
as the test search in test/unit.

- Charlie