Issue #15408 has been updated by headius (Charles Nutter).


> I'm unsure how to garbage-collect the wrapped objects, though

Oh, this leads to another feature Ruby really needs to add, related to the _id2ref removal: reference queues.

On the JVM, when you create a WeakReference, you can register it with a ReferenceQueue. When the object associated with the WeakReference is collected, the WeakReference is emptied and pushed onto the ReferenceQueue (by the GC). Later on, or in another thread, you can pull from that queue to clean up resources like wrappers, native pointers, or Hash entries.
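For readers without a JVM at hand, the idea can be roughly approximated in plain Ruby with finalizers. This is only a sketch under stated assumptions: `FinalizerQueue`, `register`, and `poll` are hypothetical names, the GC decides when (or whether) a finalizer runs, and nothing here is the JVM's or weakling's actual API.

```ruby
# Hypothetical emulation of a reference queue using finalizers.
# When a registered object is collected, its token is pushed onto a
# thread-safe Queue that other code can drain later.
class FinalizerQueue
  def initialize
    @queue = Queue.new
  end

  # Built in a class method so the proc does not capture the tracked
  # object itself, which would keep that object alive forever.
  def self.callback(queue, token)
    proc { |_object_id| queue << token }
  end

  # Associates +token+ (e.g. a hash key or native handle) with +object+.
  def register(object, token)
    ObjectSpace.define_finalizer(object, FinalizerQueue.callback(@queue, token))
    token
  end

  # Non-blocking: returns the next token whose object has been collected,
  # or nil if nothing is pending.
  def poll
    @queue.pop(true)
  rescue ThreadError
    nil
  end
end
```

A cleanup thread would then call `poll` in a loop and release whatever resource each token names, instead of scanning every entry.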

Without ReferenceQueue, Ruby has no efficient way of cleaning up emptied WeakRef objects (you have to scan all of them looking for dead ones). I fixed this for JRuby in the `weakling` gem by exposing the JVM's ReferenceQueue implementation: https://github.com/headius/weakling/blob/master/ext/org/jruby/ext/RefQueueLibrary.java#L56
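To make the cost of scanning concrete, here is a minimal sketch using only the standard `weakref` library. `ScanningRegistry` and its methods are hypothetical names chosen for illustration.

```ruby
require "weakref"

# Hypothetical registry mapping keys to weakly held objects.
class ScanningRegistry
  def initialize
    @refs = {} # key => WeakRef
  end

  def store(key, object)
    @refs[key] = WeakRef.new(object)
  end

  # Returns the object, or nil if the key is unknown or the object is gone.
  def fetch(key)
    ref = @refs[key]
    return nil unless ref
    ref.__getobj__
  rescue WeakRef::RefError
    nil
  end

  # Without a reference queue the only way to find dead entries is to
  # visit every one, so each pass costs O(n) in the table size no matter
  # how few objects have actually been collected.
  def prune
    @refs.delete_if { |_key, ref| !ref.weakref_alive? }
    @refs.size
  end
end
```

With a reference queue, `prune` would instead drain only the entries the GC reported as dead.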

With proper weak references and a reference queue, *anyone* can implement WeakMap efficiently on their own, and that covers most uses of _id2ref. We should still ship an officially supported WeakMap, though.
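For comparison, MRI already has `ObjectSpace::WeakMap`, whose documentation describes it as mostly for internal use by WeakRef. The sketch below shows the kind of use an officially supported WeakMap would cover; the `Struct` and variable names are made up for illustration.

```ruby
# Attach side data to an object without keeping the object alive.
# ObjectSpace::WeakMap compares keys by identity and holds both keys
# and values weakly, so the value must be referenced elsewhere too.
Wrapper = Struct.new(:native_handle) # hypothetical wrapper type

wrappers = ObjectSpace::WeakMap.new
target   = Object.new
wrapper  = Wrapper.new(42)

wrappers[target] = wrapper

found = wrappers[target]        # the same Wrapper while target is alive
known = wrappers.key?(target)   # true while target is alive
```

Once `target` becomes unreachable, the entry can disappear on its own, with no ID-to-object lookup involved.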

----------------------------------------
Feature #15408: Deprecate object_id and _id2ref
https://bugs.ruby-lang.org/issues/15408#change-75670

* Author: headius (Charles Nutter)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
----------------------------------------
Ruby currently provides the object_id method to get an "identifier" for a given object. According to the documentation, this ID is the same for every object_id call against a given object, and is guaranteed not to be the same as that of any other active (i.e. alive) object. However, no guarantee is made that the ID will not be reused for a future object after the original has been garbage collected.

As a result, object_id can't be used to uniquely identify any object that might be garbage collected, since that ID may be associated with a completely different object in the future.

Ruby also provides a method to go from an object_id to the object reference itself: ObjectSpace._id2ref. This method has been in Ruby for decades and is often used to implement a weak hashmap from ID to reference, since holding the ID will not keep the object alive. However, because object_id is not actually unique over time, _id2ref can return a different object from the one that originally had that ID, as object slots are reused in the heap.

The only way to implement object_id safely (with idempotency guarantees) would be to assign to all objects a monotonically-increasing ID. Alternatively, this ID could be assigned lazily only for those objects on which the code calls object_id. JRuby implements object_id in this way currently.
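The lazy variant can be sketched in a few lines of Ruby. This illustrates the idea only and is not JRuby's implementation: `LazyId`, `lazy_id`, and `Widget` are hypothetical names, and the instance-variable storage used here would not work for frozen objects.

```ruby
# Hypothetical mixin: hands out a process-wide, monotonically increasing
# ID the first time an object asks for one, and never reuses it.
module LazyId
  @next_id = 0
  @lock = Mutex.new

  class << self
    # Thread-safe counter; each call returns a fresh, larger integer.
    def next_id
      @lock.synchronize { @next_id += 1 }
    end
  end

  # The ID is assigned on first call and cached on the object, so objects
  # that never ask for an ID pay nothing.
  def lazy_id
    @__lazy_id ||= LazyId.next_id
  end
end

class Widget # hypothetical example class
  include LazyId
end
```

Because the counter only moves forward, an ID handed out here can never later refer to a different object, which is the property the slot-address-based object_id lacks.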

The only way to implement _id2ref safely would be to have a mapping in memory from those monotonically-increasing IDs to the actual objects. This would have to be a weak mapping, so that the mapping itself does not prevent the objects from being garbage collected. JRuby currently only supports _id2ref via a flag, since the additional overhead of weakly tracking every requested object_id is extremely high. An alternative for MRI would be to implement _id2ref as a heap scan, as it is implemented in Rubinius. This would make it entirely impractical due to the cost of scanning the heap for every ID lookup.
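A rough sketch of such a weak ID-to-object table, using the standard `weakref` library, shows where the overhead comes from. `WeakIdMap`, `register`, and `lookup` are hypothetical names; for brevity each `register` call issues a new ID rather than reusing one per object.

```ruby
require "weakref"

# Hypothetical table from never-reused IDs to weakly held objects.
class WeakIdMap
  def initialize
    @next_id = 0
    @refs = {} # id => WeakRef; one extra entry and one WeakRef per tracked object
    @lock = Mutex.new
  end

  # Returns a fresh ID for +object+ without keeping the object alive.
  def register(object)
    @lock.synchronize do
      id = (@next_id += 1)
      @refs[id] = WeakRef.new(object)
      id
    end
  end

  # Returns the object for +id+, or nil if the ID is unknown or the
  # object has been collected. Never returns some other object.
  def lookup(id)
    ref = @lock.synchronize { @refs[id] }
    return nil unless ref
    ref.__getobj__
  rescue WeakRef::RefError
    @lock.synchronize { @refs.delete(id) }
    nil
  end
end
```

Dead entries are only removed when someone happens to look them up, so the table keeps growing unless it is also scanned or fed by a reference queue.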

I propose that both methods should immediately be deprecated for removal in Ruby 3.0.

* They do not do what people expect.
* They cannot reliably do what they claim to do.
* They eventually lead to difficult-to-diagnose bugs in every possible use case.

Put simply, both methods have always been broken in MRI and making them unbroken would render them useless.


