Issue #16897 has been updated by jeremyevans0 (Jeremy Evans).


As @Eregon mentioned, the `***a` approach is likely to be the same speed or slower than the `*args, **kw` approach on CRuby, as it has to allocate at least as many objects.  It could be theoretically possible to increase performance by reducing allocations in the following cases:

* 0-3 arguments with no keywords
* 0-1 arguments with 1 keyword

You could do this by storing the arguments inside the object, similar to how arrays and hashes are optimized internally.  Those cases would allow you to get by with a single object allocation instead of allocating two objects (array+hash), assuming that the caller side is not doing any object allocation.  All other cases would be as slow or slower.

This approach would only be faster if you never needed to access the arguments or keywords passed.  As soon as you need access to the arguments or keywords, it would likely be slower as it would have to allocate an array or hash for them.  This limits the usefulness of the approach to specific cases.

When you compare `***a` to `ruby2_keywords`, which is currently the fastest approach, I believe the cases where it could theoretically be faster are limited to 0-1 arguments with 1 keyword.

This approach will increase complexity in an already complex system.  It would be a significant undertaking to implement, and it's not clear it would provide a net performance improvement.

It is true that supporting `ruby2_keywords` makes `*args` calls without keywords slower.  I think the maximum slowdown was around 10%, and that was when the callee did not accept a splat or keywords. When the callee accepted a splat or keywords, I think the slowdown was around 1%.  However, as `ruby2_keywords` greatly speeds up delegation (see below), `ruby2_keywords` results in a net increase in performance in the majority of cases.  Until `ruby2_keywords` no longer results in a net increase in performance in the majority of cases, I believe it should stay.
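
For readers less familiar with the mechanism, here is a minimal sketch of what `ruby2_keywords` does for a splat-only delegator. The `Forwarder` class and its method names are hypothetical. `Hash.ruby2_keywords_hash?` is used only to show that the trailing hash is flagged when keywords were passed:

```ruby
class Forwarder
  # Final target: reports how it received positional and keyword arguments.
  def target(*args, **kw)
    [args, kw]
  end

  # Splat-only delegator. ruby2_keywords flags a trailing keywords hash so
  # that splatting it later passes it as keywords again.
  ruby2_keywords def delegate(*args)
    last = args.last
    # Hash.ruby2_keywords_hash? only accepts a Hash, so guard the type first.
    flagged = last.is_a?(Hash) && Hash.ruby2_keywords_hash?(last)
    [flagged, target(*args)]
  end
end

forwarder = Forwarder.new
p forwarder.delegate(1, k: 2)
p forwarder.delegate(1)
```

The delegator only ever holds one array, with the keywords riding along as its flagged last element. That is where the saving over `*args, **kw` comes from.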

Here's a benchmark showing a 160% improvement in delegation performance in master by using `ruby2_keywords` instead of `*args, **kw`:

```ruby
def m1(arg) end
def m2(*args) end
def m3(arg, k: 1) end
def m4(*args, k: 1) end
def m5(arg, **kw) end
def m6(*args, **kw) end

ruby2_keywords def d1(*args)
  m2(*args);m2(*args);m2(*args);m2(*args);m2(*args);
  m3(*args);m3(*args);m3(*args);m3(*args);m3(*args);
  m4(*args);m4(*args);m4(*args);m4(*args);m4(*args);
  m5(*args);m5(*args);m5(*args);m5(*args);m5(*args);
  m6(*args);m6(*args);m6(*args);m6(*args);m6(*args);
end
ruby2_keywords def d1a(*args)
  m1(*args);m1(*args);m1(*args);m1(*args);m1(*args);
end

def d2(*args, **kw)
  m2(*args, **kw);m2(*args, **kw);m2(*args, **kw);m2(*args, **kw);m2(*args, **kw);
  m3(*args, **kw);m3(*args, **kw);m3(*args, **kw);m3(*args, **kw);m3(*args, **kw);
  m4(*args, **kw);m4(*args, **kw);m4(*args, **kw);m4(*args, **kw);m4(*args, **kw);
  m5(*args, **kw);m5(*args, **kw);m5(*args, **kw);m5(*args, **kw);m5(*args, **kw);
  m6(*args, **kw);m6(*args, **kw);m6(*args, **kw);m6(*args, **kw);m6(*args, **kw);
end
def d2a(*args, **kw)
  m1(*args, **kw);m1(*args, **kw);m1(*args, **kw);m1(*args, **kw);m1(*args, **kw);
end

require 'benchmark'

print "ruby2_keywords: "
puts(Benchmark.measure do
  100000.times do
    d1a(1)
    d1(1, k: 1)
  end
end)

print "   *args, **kw: "
puts(Benchmark.measure do
  100000.times do
    d2a(1)
    d2(1, k: 1)
  end
end)
```

Results:

```
ruby2_keywords:   1.350000   0.000000   1.350000 (  1.395517)
   *args, **kw:   3.630000   0.000000   3.630000 (  3.693702)
```
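
As a rough check on the quoted improvement, the ratio can be worked out from the wall-clock column of the results above (a sketch, with variable names chosen here for illustration):

```ruby
# Wall-clock seconds copied from the results above.
ruby2_keywords_real = 1.395517
splat_kw_real = 3.693702

# How many times longer the *args, **kw version takes.
speedup = splat_kw_real / ruby2_keywords_real
# The same ratio expressed as a percentage improvement.
improvement_pct = (speedup - 1) * 100

puts format("%.2fx faster, about %.0f%% improvement", speedup, improvement_pct)
```

This lands in the same range as the figure quoted above. A single run like this is noisy, so the exact percentage should not be read too closely.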

----------------------------------------
Feature #16897: Can a Ruby 3.0 compatible general purpose memoizer be written in such a way that it matches Ruby 2 performance?
https://bugs.ruby-lang.org/issues/16897#change-86000

* Author: sam.saffron (Sam Saffron)
* Status: Open
* Priority: Normal
----------------------------------------
```ruby
require 'benchmark/ips'

module Memoizer
  def memoize_26(method_name)
    cache = {}

    uncached = "#{method_name}_without_cache"
    alias_method uncached, method_name

    define_method(method_name) do |*arguments|
      found = true
      data = cache.fetch(arguments) { found = false }
      unless found
        cache[arguments] = data = public_send(uncached, *arguments)
      end
      data
    end
  end

  def memoize_27(method_name)
    cache = {}

    uncached = "#{method_name}_without_cache"
    alias_method uncached, method_name

    define_method(method_name) do |*args, **kwargs|
      found = true
      all_args = [args, kwargs]
      data = cache.fetch(all_args) { found = false }
      unless found
        cache[all_args] = data = public_send(uncached, *args, **kwargs)
      end
      data
    end
  end

  def memoize_27_v2(method_name)
    uncached = "#{method_name}_without_cache"
    alias_method uncached, method_name

    cache = "MEMOIZE_#{method_name}"

    params = instance_method(method_name).parameters
    has_kwargs = params.any? {|t, name| "#{t}".start_with? "key"}
    has_args = params.any? {|t, name| !"#{t}".start_with? "key"}

    args = []

    args << "args" if has_args
    args << "kwargs" if has_kwargs

    args_text = args.map do |n|
      n == "args" ? "*args" : "**kwargs"
    end.join(",")

    class_eval <<~RUBY
      #{cache} = {}
      def #{method_name}(#{args_text})
        found = true
        all_args = #{args.length == 2 ? "[args, kwargs]" : args[0]}
        data = #{cache}.fetch(all_args) { found = false }
        unless found
          #{cache}[all_args] = data = public_send(:#{uncached} #{args.empty? ? "" : ", #{args_text}"})
        end
        data
      end
    RUBY
  end
end

module Methods
  def args_only(a, b)
    sleep 0.1
    "#{a} #{b}"
  end

  def kwargs_only(a:, b: nil)
    sleep 0.1
    "#{a} #{b}"
  end

  def args_and_kwargs(a, b:)
    sleep 0.1
    "#{a} #{b}"
  end
end

class OldMethod
  extend Memoizer
  include Methods

  memoize_26 :args_and_kwargs
  memoize_26 :args_only
  memoize_26 :kwargs_only
end

class NewMethod
  extend Memoizer
  include Methods

  memoize_27 :args_and_kwargs
  memoize_27 :args_only
  memoize_27 :kwargs_only
end

class OptimizedMethod
  extend Memoizer
  include Methods

  memoize_27_v2 :args_and_kwargs
  memoize_27_v2 :args_only
  memoize_27_v2 :kwargs_only
end

OptimizedMethod.new.args_only(1,2)


methods = [
  OldMethod.new,
  NewMethod.new,
  OptimizedMethod.new
]

Benchmark.ips do |x|
  x.warmup = 1
  x.time = 2

  methods.each do |m|
    x.report("#{m.class} args only") do |times|
      while times > 0
        m.args_only(10, b: 10)
        times -= 1
      end
    end

    x.report("#{m.class} kwargs only") do |times|
      while times > 0
        m.kwargs_only(a: 10, b: 10)
        times -= 1
      end
    end

    x.report("#{m.class} args and kwargs") do |times|
      while times > 0
        m.args_and_kwargs(10, b: 10)
        times -= 1
      end
    end
  end

  x.compare!
end


# # Ruby 2.6.5
# #
# OptimizedMethod args only:   974266.9 i/s
#  OldMethod args only:   949344.9 i/s - 1.03x  slower
# OldMethod args and kwargs:   945951.5 i/s - 1.03x  slower
# OptimizedMethod kwargs only:   939160.2 i/s - 1.04x  slower
# OldMethod kwargs only:   868229.3 i/s - 1.12x  slower
# OptimizedMethod args and kwargs:   751797.0 i/s - 1.30x  slower
#  NewMethod args only:   730594.4 i/s - 1.33x  slower
# NewMethod args and kwargs:   727300.5 i/s - 1.34x  slower
# NewMethod kwargs only:   665003.8 i/s - 1.47x  slower
#
# #
# # Ruby 2.7.1
#
# OptimizedMethod kwargs only:  1021707.6 i/s
# OptimizedMethod args only:   955694.6 i/s - 1.07x  (± 0.00) slower
# OldMethod args and kwargs:   940911.3 i/s - 1.09x  (± 0.00) slower
#  OldMethod args only:   930446.1 i/s - 1.10x  (± 0.00) slower
# OldMethod kwargs only:   858238.5 i/s - 1.19x  (± 0.00) slower
# OptimizedMethod args and kwargs:   773773.5 i/s - 1.32x  (± 0.00) slower
# NewMethod args and kwargs:   772653.3 i/s - 1.32x  (± 0.00) slower
#  NewMethod args only:   771253.2 i/s - 1.32x  (± 0.00) slower
# NewMethod kwargs only:   700604.1 i/s - 1.46x  (± 0.00) slower
```

The bottom line is that a generic delegator often needs to make use of all the arguments provided to a method.

```ruby
def count(*args, **kwargs)
  counter[[args, kwargs]] += 1
  orig_count(*args, **kwargs)
end
```

The old pattern meant we could get away with one less array allocation per call:

```ruby
def count(*args)
  counter[args] += 1
  orig_count(*args)
end
```
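
For comparison, here is a sketch of how the single-array pattern could be kept on Ruby 2.7+ by marking the delegator with `ruby2_keywords`, the approach the reply at the top of this thread describes as currently the fastest. The `CountingDelegator` class, its `counter` hash, and the body of `orig_count` are made up for illustration:

```ruby
class CountingDelegator
  attr_reader :counter

  def initialize
    # Missing keys default to 0 so that += works on first use.
    @counter = Hash.new(0)
  end

  # Stand-in for the original method being wrapped.
  def orig_count(a, b: nil)
    "#{a} #{b}"
  end

  # Only one array is captured; a trailing keywords hash is flagged so that
  # splatting args passes it on as keywords.
  ruby2_keywords def count(*args)
    counter[args] += 1
    orig_count(*args)
  end
end

delegator = CountingDelegator.new
2.times { delegator.count(1, b: 2) }
p delegator.counter
```

One caveat with this sketch: a call that passes keywords and a call that passes an explicit trailing hash produce equal keys, so the two are counted together even though Ruby 3 would dispatch them differently.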

I would like to propose some changes to Ruby 3 that would allow us to recover this performance.


Perhaps:

```ruby
def count(...)
  args = ...
  counter[args] += 1
  orig_count(...)
end
```

Or:

```ruby
def count(***args)

  counter[args] += 1
  orig_count(***args)
end
```

Thoughts?



