Issue #16188 has been updated by Eregon (Benoit Daloze).


jeremyevans0 (Jeremy Evans) wrote:

Thank you for your detailed reply.

> Internally, there are not really performance implications for the choice to treat keyword arguments as a last positional hash.  If we not do so, the `foo(key: value)` call would be an ArgumentError.

I think there are. If args and kwargs are completely separated, we can choose to represent kwargs as we wish, and only need to build a Hash if the method takes a `**kwrest`.
But in the current approach, for a call site calling both a method taking kwargs and another not taking kwargs, we'll need logic to handle both the kwargs case and the conversion to a last positional Hash. If they were completely separated, the positional Hash would just be an ArgumentError and we wouldn't need that logic at all.
I agree that would be too incompatible though, so let's focus on optimizing the semantics in Ruby 2.7 and for Ruby 3.

> I don't believe we had to give up any optimizations in CRuby when I took mame's original branch and added backwards compatibility to treat keyword arguments as a last positional hash.

I think CRuby does not do many optimizations for keyword arguments except for the simplest case, I'd like to consider optimizations and JITs in general.
A few extra checks in MRI might not be noticeable if some form of kwargs (e.g., `**kwargs`) is generally slow, but could very well be significant on a more optimizing Ruby implementation.

> If you pass a literal hash to a method that accepts a keyword splat (where the literal hash is treated as keywords), I assume there still must be overhead compared to passing a literal hash to a method that it does not accept keyword arguments, as it must allocate a new hash.

`**kwrest` allocates conceptually, but the allocation does not need to happen, for instance if the Hash instance is not stored anywhere and escape analysis can figure it out.
This is how keyword arguments are currently optimized in TruffleRuby: we always create the Hash, but the escape analysis in the JIT figures it does not need to actually allocate it in many cases.

> I think passing a literal hash to a method that accepts explicit keywords but no keyword hash can be allocation-less.  I'm not sure if the JIT engine change things in this area, maybe k0kobun knows.

When I mentioned JIT, I was thinking specifically to TruffleRuby, JRuby and highly-optimizing JITs.
I think MJIT doesn't optimize kwargs better than the interpreter currently.

> The general structure remains the same.  There are warnings added for methods that accept keyword arguments that will break in Ruby 3:

Right, what about Ruby 3?
We should aim something that is both semantically simpler and easier to optimize then.

> There are some behavior changes compared to 2.7:
> 
> * No need to split keyword hash (or last hash treated as keywords) if a keyword splast is accepted, as keyword splats accept non-Symbol keys in 2.7.

That should be good for performance, no more "crazy" splitting in argument handling and complicated checks for whether splitting should be tried :)

> * Passing an empty keyword splat to a method no longer passes an argument, unless this argument is a required positional argument, in which case it is warned.
> 
> This last point is the one behavior in CRuby 2.7 that has significant performance implications.  What happens is an empty hash argument is not passed, but the call is flagged that empty keyword hash was passed.  If the empty keyword hash turns out to be required to fulfill a positional argument, an empty hash is added back.  This requires allocating a temporary buffer to store the new argument array.

Right, so we need to pass an extra flag with the call for that case at least.

How are *args and **kwargs actually represented in MRI?
Is it like a flat array of "arguments", and the last one can be a keyword Hash, and there is some flag passed to the callee to differentiate a last positional Hash from keyword arguments?
Then methods taking kwargs need to check the flag to know how many positional args are passed, and whether kwargs are passed.
Methods not taking kwargs don't even need to check the flag, and can consider the entire flat array as positional arguments.

I guess cases like `foo(a: 1, b: 2)` are not represented with a Hash to avoid allocations in MRI, how does it look like then for such a case?
That makes the case for methods not taking kwargs quite more complicated, as they need to read the "passed optimized kwargs" flag and if set convert those optimized kwargs to a Hash, and add + 1 to the count of positional arguments.

> I don't think this is true in CRuby. It certainly could be true in other Ruby implementations.  However, I would think you could always treat things as passing keywords in the caller code, and just have the callee convert keywords to a positional hash if the method does not accept keywords.

Right, that sounds like the easiest approach.

> For methods that are flagged with `ruby2_keywords` (which can only happen if the method accepts a regular splat but no keywords or keyword splat), if the method is called with keywords, we set a flag so that the keyword hash is treated as a last positional hash with the keyword flag set.  Additionally, in this case the empty keyword hash is not removed, it is treated the same as a non-empty keyword hash.

That should be fine, because it means only `ruby2_keywords` flagged methods would have a little overhead for that extra logic (i.e., writing a flag or @ivar on the Hash).
Deciding to keep or remove the empty keyword hash case should itself be free once JITed, as it's a never-changing property of a method, if we make `ruby2_keywords` define a new method internally.
 
> For all method calls that use an argument splat and pass no keywords or keyword splat, if the last element of the argument splat array is a hash with the keyword flag set, that argument is treated as keywords instead of a positional hash.

That's the one I'm concerned about, it means every `foo(some arguments)` with `def foo(*args); bar(*args); end` will have to check if the last positional argument is a Hash with the flag set:
`args.size >= 1 && Hash === args[-1] && flagged?(args[-1])`. That's a quite a few branches and it's not just reading one bit from memory, it's at least 4 memory loads (size, [-1], .class, .flag).
I think this is another motivation for having `ruby2_keywords` only in Ruby 2.7, as it's clearly not free for performance.

We could skip that if we know currently no method uses `ruby2_keywords`, but as soon as it's used we have to perform the check for every method call with the method taking a rest argument (and no explicit kwargs).
I would think there are quite a few methods taking a rest argument and no explicit kwargs, i.e. `def foo(*args)`, `def foo(req, *args)`, `def foo(opt=default, *args)`, etc all need the extra check
but `def foo(*args, **kwargs)`/`def foo(*args, kwarg:)`/`def foo(*args, kwarg: default)` do not need the check.

Maybe we should be more explicit about when to convert a flagged Hash to keyword arguments.
For instance we could have, using ActionDispatch::MiddlewareStack::Middleware's example:
```ruby
class Middleware
  attr_reader :args, :block, :klass

  def initialize(klass, args, block)
    @klass = klass
    @args  = args
    @block = block
  end
  ruby2_keywords :initialize if respond_to?(:ruby2_keywords, true)

  # Current code
  def build(app)
    klass.new(app, *args, &block)
  end

  # idea to be explicit when converting to keyword args
  # and avoid extra checks in other call to methods with a *rest argument.
  def build(app)
    klass.send_keyword_hash(:new, app, *args, &block)
  end
end
```

That would limit that complicated check to only `send_pass_kwargs` calls, and not any other calls.
It's also more explicit and less magic, which might be easier to understand and explain.
I'm unsure about a good name, maybe `send_pass_kwargs`, `send_pass_flagged_kwargs`, `send_pass_flagged_hash`, `send_flagged_hash_as_kwargs`, `send_kwhash`, `send_keyword_hash`?
What do you think?

----------------------------------------
Misc #16188: What are the performance implications of the new keyword arguments in 2.7 and 3.0?
https://bugs.ruby-lang.org/issues/16188#change-82036

* Author: Eregon (Benoit Daloze)
* Status: Open
* Priority: Normal
* Assignee: jeremyevans0 (Jeremy Evans)
----------------------------------------
In #14183, keyword arguments became further separated from positional arguments.

Contrary to the original design though, keyword and positional arguments are not fully separated for methods not accepting keyword arguments.
Example: `foo(key: :value)` will `def foo(hash)` will pass a positional argument.
This is of course better for compatibility, but I wonder what are the performance implications.

The block argument is completely separate in all versions, so no need to concern ourselves about that.

In Ruby <= 2.6:
* The caller never needs to know about the callee's arguments, it can just take all arguments and pass them as an array.
  The last argument might be used to extract keyword, but this is all done at the callee side.
* Splitting kwargs composed of Symbol and non-Symbol keys can be fairly expensive, but it is a rare occurrence.
  If inlining the callee and kwargs are all passed as a literal Hash at the call site, there shouldn't be any overhead compared to positional arguments once JIT'ed.

In Ruby 2.7:
* The caller needs to pass positional and keyword arguments separately, at least when calling a method accepting kwargs.
  But, if it calls a methods not accepting kwargs, then the "kwargs" (e.g. `foo(key: :value)`) should be treated just like a final Hash positional argument.
* (If we had complete separation, then we could always pass positional and keyword arguments separately, so the caller could once again ignore the callee)

How is the logic implemented in MRI for 2.7?

Specializing the caller for a given callee is a well-known technique.
However, it becomes more difficult if different methods are called from the same callsite (polymorphic call), especially if one accepts kwargs and another does not.
In that case, I think we will see a performance cost to this approach, by having to pass arguments differently based on the method to be called.

What about delegation using `ruby2_keywords`?
Which checks does that add (compared to 2.6) in the merged approach with the Hash flag?



-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request / ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>