Issue #15408 has been updated by Eregon (Benoit Daloze).


mrkn (Kenta Murata) wrote:
> Rather, in pycall, I use `object_id` and `_id2ref` for managing wrappers of Python classes and Python modules in Ruby.
> The reason why I use them is I want to reuse existing wrappers, while I don't want to protect the wrappers from garbage collection.
> For this purpose, pycall has mappings from a Python object pointer to an ID of Ruby object, which is a wrapper of the Python object.
> The existing WeakRef can be used for this purpose, but I didn't want to use it because I want to reduce the frequency of object generation.

But then this is exposed to the _id2ref bug, i.e., it could return another wrapper if the wrapper is GC'd and a new wrapper gets the same address.
Then that would probably be a very serious bug exposed to pycall users.

----------------------------------------
Feature #15408: Deprecate object_id and _id2ref
https://bugs.ruby-lang.org/issues/15408#change-75673

* Author: headius (Charles Nutter)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
----------------------------------------
Ruby currently provides the object_id method to get a "identifier" for a given object. According to the documentation, this ID is the same for every object_id call against a given object, and guaranteed not to be the same as any other active (i.e. alive) object. However, no guarantee is made about the ID being reused for a future object after the original has been garbage collected.

As a result, object_id can't be used to uniquely identify any object that might be garbage collected, since that ID may be associated with a completely different object in the future.

Ruby also provides a method to go from an object_id to the object reference itself: ObjectSpace._id2ref. This method has been in Ruby for decades and is often used to implement a weak hashmap from ID to reference, since holding the ID will not keep the object alive. However due to the problems with object_id not actually being unique, it's possible for _id2ref to return a different object than originally had that ID as object slots are reused in the heap.

The only way to implement object_id safely (with idempotency guarantees) would be to assign to all objects a monotonically-increasing ID. Alternatively, this ID could be assigned lazily only for those objects on which the code calls object_id. JRuby implements object_id in this way currently.

The only way to implement _id2ref safely would be to have a mapping in memory from those monotonically-increasing IDs to the actual objects. This would have to be a weak mapping to prevent the objects from being garbage collected. JRuby currently only supports _id2ref via a flag, since the additional overhead of weakly tracking every requested object_id is extremely high. An alternative for MRI would be to implement _id2ref as a heap scan, as it is implemented in Rubinius. This would make it entirely unpractical due to the cost of scanning the heap for every ID lookup.

I propose that both methods should immediately be deprecated for removal in Ruby 3.0.

* They do not do what people expect.
* They cannot reliably do what they claim to do.
* They eventually lead to difficult-to-diagnose bugs in every possible use case.

Put simply, both methods have always been broken in MRI and making them unbroken would render them useless.



-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request / ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>