Issue #15711 has been updated by tenderlovemaking (Aaron Patterson).


> Doesn't it cause a large overhead to maintain the $id_to_obj map? (#15626)

I don't know if it's "large" exactly. But we only need to maintain the map if someone ever calls `object_id`, and that is rare. Maybe not "never", but it's not a real-world bottleneck.

> If there was no _id2ref, we'd just need an atomic increment for object_id, right?

I think MRI will require an atomic increment and a map always (at least until we can get variable width objects).  We don't have a place to store the id for the object, so it has to be stored in some kind of map, whether that is the instance variable table for an object, or a global table (which is what we have now).

> TruffleRuby implements _id2ref but it's very inefficient (basically search in ObjectSpace.each_object), and I don't think there is a reasonable way to make it efficient with a moving GC (the map overhead seems pretty high, both footprint and computation wise).

We maintain two maps, an "id to address" and an "address to id" map.  When compaction runs it just updates both of those maps.  In terms of time and space, it's certainly not free, but like I said I don't think people access an object id very frequently in the real world.
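To make the two-map idea concrete, here is a toy model in plain Ruby. It is only a sketch: addresses are ordinary integers, and `IdTable`, `id_for`, `addr_for`, and `moved` are hypothetical names, not MRI internals.

```ruby
# Toy model of an "id to address" and "address to id" table pair.
# Hypothetical names; addresses are simulated with plain integers.
class IdTable
  def initialize
    @next_id = 0
    @id_to_addr = {}
    @addr_to_id = {}
    @lock = Mutex.new
  end

  # Assign an id lazily, the first time one is requested for this address.
  def id_for(addr)
    @lock.synchronize do
      @addr_to_id[addr] ||= begin
        id = (@next_id += 1)
        @id_to_addr[id] = addr
        id
      end
    end
  end

  # Reverse lookup, the part that _id2ref would rely on.
  def addr_for(id)
    @lock.synchronize { @id_to_addr[id] }
  end

  # Called by the (simulated) compactor when an object moves.
  def moved(old_addr, new_addr)
    @lock.synchronize do
      id = @addr_to_id.delete(old_addr)
      next if id.nil? # the object never had its id requested
      @addr_to_id[new_addr] = id
      @id_to_addr[id] = new_addr
    end
  end
end
```

Objects whose id was never requested cost nothing in either table, which is the point made above about the overhead being paid only on access.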

Also I'm totally happy if we get rid of `_id2ref`.  But since you can't accidentally access random memory with `_id2ref`, and calling `object_id` doesn't seem like a bottleneck, this just seems less urgent.

----------------------------------------
Bug #15711: Remove use of _id2ref from DRb
https://bugs.ruby-lang.org/issues/15711#change-85957

* Author: headius (Charles Nutter)
* Status: Closed
* Priority: Normal
* Assignee: seki (Masatoshi Seki)
* Backport: 2.4: UNKNOWN, 2.5: UNKNOWN, 2.6: UNKNOWN
----------------------------------------
This issue relates to https://bugs.ruby-lang.org/issues/15408

DRb uses `_id2ref` internally to implement a weak map, and this issue seeks to replace that code with an implementation that does not use `_id2ref`.

We will be deprecating `ObjectSpace._id2ref` in the near future since it fails to work like people expect (when implemented as a pointer address) or adds memory and invocation overhead to `object_id`.

An initial patch for this is provided by JRuby, which implements `object_id` using a monotonically-increasing value, and only allows `_id2ref` use with a command line flag.

https://github.com/ruby/ruby/compare/trunk...jruby:jruby-ruby_2_6_0#diff-e979bf2f831d9826629559b8628809e9

This implementation uses the stdlib `weakref` to implement a simple weak map, and it would be suitable as an implementation for now. However there's some inefficiency here because it has to periodically "clean" the hash of vacated references by scanning all entries.
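For readers who have not opened the diff, the general shape of a `weakref`-based map with periodic cleaning might look like the sketch below. This is not the code from the patch: `SweepingWeakIdMap`, `sweep_every`, and the method names are made up for illustration.

```ruby
require "weakref"

# Hypothetical weak id map: id => WeakRef. Not the code from the patch.
class SweepingWeakIdMap
  def initialize(sweep_every: 100)
    @refs = {}
    @sweep_every = sweep_every
    @adds = 0
    @lock = Mutex.new
  end

  # Register an object and return the id that can be sent over the wire.
  def add(obj)
    @lock.synchronize do
      id = obj.__id__
      @refs[id] = WeakRef.new(obj)
      @adds += 1
      sweep if (@adds % @sweep_every).zero?
      id
    end
  end

  # Look an object up by id; RangeError if unknown or already collected.
  def fetch(id)
    @lock.synchronize do
      ref = @refs[id]
      raise RangeError, "invalid reference: #{id}" if ref.nil?
      begin
        ref.__getobj__
      rescue WeakRef::RefError
        @refs.delete(id)
        raise RangeError, "object #{id} was collected"
      end
    end
  end

  def size
    @lock.synchronize { @refs.size }
  end

  private

  # The inefficiency mentioned above: every entry is visited on each sweep.
  def sweep
    @refs.delete_if { |_id, ref| !ref.weakref_alive? }
  end
end
```

The sweep is linear in the number of entries, so its cost grows with the number of remoted objects even when few of them have been collected.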

There are two more efficient implementations that require additional work:

Alternate 1: Use `ObjectSpace::WeakMap`, which is an opaque VM-supported implementation of a weak Hash. Unfortunately I don't think `WeakMap` has ever been blessed as a public API, and since we're rapidly moving standard libraries to gems, it would not be appropriate to use an internal API. So, we can either make WeakMap an official part of the public standard API, or do alternate 2.
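As a rough illustration of Alternate 1, the sketch below keeps remoted objects as weakly held keys of an `ObjectSpace::WeakMap`. The class name `WeakMapIdSet` and its methods are hypothetical, and whether immediates such as integers or symbols are accepted as keys depends on the Ruby version, so the sketch only assumes ordinary heap objects.

```ruby
# Hypothetical sketch of a weak id set on top of ObjectSpace::WeakMap.
class WeakMapIdSet
  def initialize
    @map = ObjectSpace::WeakMap.new # keys are held weakly, compared by identity
    @lock = Mutex.new
  end

  # Register an object and return its id.
  def add(obj)
    @lock.synchronize do
      @map[obj] = self # the value is only a live placeholder
      obj.__id__
    end
  end

  # Find the registered object with this id; RangeError if there is none.
  def fetch(id)
    @lock.synchronize do
      found = @map.keys.find { |key| key.__id__ == id }
      raise RangeError, "invalid reference: #{id}" if found.nil?
      found
    end
  end
end
```

No explicit cleaning pass is needed here because the VM drops dead entries, but this particular lookup scans the live keys, so it trades sweep cost for fetch cost.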

Alternate 2: Add weak reference queues to the weakref API, so users can implement their own efficient weak maps. Some of this has been discussed (at great length) in https://bugs.ruby-lang.org/issues/4168, and the JRuby team has supported the [weakling](https://github.com/headius/weakling) gem for many years (which provides a WeakRef+RefQueue implementation for JRuby).
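The stdlib `weakref` has no reference queue, so the following is only an approximation of the idea behind Alternate 2: a finalizer registered with `ObjectSpace.define_finalizer` pushes the dead object's id onto a queue, and the map removes just those entries instead of scanning everything. `QueueCleanedWeakMap`, `notifier`, and `reap` are hypothetical names, and this is not the weakling API.

```ruby
require "weakref"

# Hypothetical sketch: a finalizer-fed queue stands in for a reference queue.
class QueueCleanedWeakMap
  def initialize
    @refs = {}
    @dead = Queue.new # ids of collected objects, filled by finalizers
    @lock = Mutex.new
  end

  # Built in a class method so the proc does not capture the object itself,
  # which would keep it alive forever.
  def self.notifier(queue, id)
    proc { queue << id }
  end

  def add(obj)
    id = obj.__id__
    @lock.synchronize do
      reap
      unless @refs.key?(id)
        @refs[id] = WeakRef.new(obj)
        ObjectSpace.define_finalizer(obj, self.class.notifier(@dead, id))
      end
    end
    id
  end

  def fetch(id)
    @lock.synchronize do
      reap
      ref = @refs[id]
      raise RangeError, "invalid reference: #{id}" if ref.nil?
      begin
        ref.__getobj__
      rescue WeakRef::RefError
        raise RangeError, "object #{id} was collected"
      end
    end
  end

  private

  # Remove only the entries whose ids were reported dead.
  def reap
    @refs.delete(@dead.pop(true)) until @dead.empty?
  end
end
```

Cleanup work here is proportional to the number of collected objects rather than the size of the map, which is the efficiency gain a real reference queue would be expected to provide.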

The original patch works well for small numbers of remoted objects.


