Issue #16038 has been updated by matz (Yukihiro Matsumoto).


I am not sure the proposal has a real-world use case. Can you elaborate?

Matz.


----------------------------------------
Feature #16038: Provide a public WeakMap that compares by equality rather than by identity 
https://bugs.ruby-lang.org/issues/16038#change-81250

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
----------------------------------------
I know `ObjectSpace::WeakMap` isn't really supposed to be used, and that the blessed interface is `WeakRef`. However, I'd like to make a case for a better public WeakMap.

### Usage

As described in [Feature #16035], `WeakMap` is useful for deduplicating "value objects". A typical use case is as follows:

```ruby
class Position
  REGISTRY = {}
  private_constant :REGISTRY

  class << self
    def new(*)
      instance = super
      REGISTRY[instance] ||= instance
    end
  end

  attr_reader :x, :y, :z

  def initialize(x, y, z)
    @x = x
    @y = y
    @z = z
    freeze
  end

  def hash
    self.class.hash ^
      x.hash >> 1 ^
      y.hash >> 2 ^
      z.hash >> 3
  end

  def ==(other)
    other.is_a?(Position) &&
      other.x == x &&
      other.y == y &&
      other.z == z
  end
  alias_method :eql?, :==
end

p Position.new(1, 2, 3).equal?(Position.new(1, 2, 3))
```

That's pretty much the pattern [I used in Rails to deduplicate database metadata and save lots of memory](https://github.com/rails/rails/blob/f3c68c59ed57302ca54f4dfad0e91dbff426962d/activerecord/lib/active_record/connection_adapters/deduplicable.rb).

The big downside here is that the registry holds a strong reference to every value object, so they can never be GCed. That makes this pattern not viable in many cases.

### Why not use WeakRef

A couple of reasons. First, the goal of this pattern is to reduce memory usage, so allocating one extra `WeakRef` for every single value object is a bit counterproductive.

Second, it's a bit annoying to work with, as you have to constantly check whether the reference is still alive and/or rescue `WeakRef::RefError`.

Often, these two complications make the tradeoff not worth it.
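
To illustrate both points, here is a minimal sketch of the same registry built on `WeakRef`. The `Point` class and the `[x, y]` array key are made up for this example; it only shows the shape of the bookkeeping and is not a recommended implementation:

```ruby
require "weakref"

# Hypothetical value object, reduced to what the example needs.
class Point
  # Maps [x, y] to a WeakRef wrapping the canonical instance.
  # Every entry costs one extra WeakRef plus one key array.
  REGISTRY = {}
  private_constant :REGISTRY

  class << self
    def new(*)
      instance = super
      key = [instance.x, instance.y].freeze
      ref = REGISTRY[key]
      begin
        # The referent can be collected between the liveness check and
        # the dereference, hence both the check and the rescue.
        return ref.__getobj__ if ref&.weakref_alive?
      rescue WeakRef::RefError
        # Dead reference: fall through and register the new instance.
      end
      REGISTRY[key] = WeakRef.new(instance)
      instance
    end
  end

  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
    freeze
  end
end

a = Point.new(1, 2)
b = Point.new(1, 2)
p a.equal?(b)
```

Note that dead entries also stay in the `Hash` until something prunes them, which is one more chore this approach adds.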

### Ruby 2.7

Since [Feature #13498], `WeakMap` is a bit more usable, as you can now use an interned string as the unique key, e.g.:

```ruby
class Position
  REGISTRY = ObjectSpace::WeakMap.new
  private_constant :REGISTRY

  class << self
    def new(*)
      instance = super
      REGISTRY[instance.unique_id] ||= instance
    end
  end

  attr_reader :x, :y, :z, :unique_id

  def initialize(x, y, z)
    @x = x
    @y = y
    @z = z
    @unique_id = -"#{self.class}-#{x},#{y},#{z}"
    freeze
  end

  def hash
    self.class.hash ^
      x.hash >> 1 ^
      y.hash >> 2 ^
      z.hash >> 3
  end

  def ==(other)
    other.is_a?(Position) &&
      other.x == x &&
      other.y == y &&
      other.z == z
  end
  alias_method :eql?, :==
end

p Position.new(1, 2, 3).equal?(Position.new(1, 2, 3))
```

That makes the pattern much easier to work with than dealing with `WeakRef`, but there is still an extra instance per value object (the interned key string).
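
The interned string is needed because `ObjectSpace::WeakMap` looks keys up by identity, so two equal but distinct strings are different keys. A small sketch of that behavior (the key contents are arbitrary, chosen only for the example):

```ruby
map = ObjectSpace::WeakMap.new
value = Object.new

key_a = String.new("Position-1,2,3")
key_b = String.new("Position-1,2,3") # equal content, distinct object

map[key_a] = value

p key_a == key_b   # the two keys are equal by value
p map.key?(key_a)  # found: this is the very object used as the key
p map.key?(key_b)  # not found: the lookup is by identity, not equality

# String#-@ interns the string, so equal strings map to one shared object.
# Keep a reference to the interned key so it stays alive for the lookup.
interned = -key_a
map[interned] = value
p map.key?(-key_b) # found: both interned strings are the same object
```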

### Proposal

Ideally there would be a `WeakMap` that compares keys by equality, so that the first snippet could simply replace `{}` with `WeakMap.new`.

Changing `ObjectSpace::WeakMap`'s behavior would cause issues, so I see two possibilities:

  - The best IMO would be to have a new top-level `::WeakMap` be the equality-based map, and have `ObjectSpace::WeakMap` remain as a semi-private interface backing `WeakRef`.
  - Alternatively, `ObjectSpace::WeakMap` could have a `compare_by_equality` method (the inverse of `Hash#compare_by_identity`) to change its behavior after instantiation.

I personally prefer the first one.



