Issue #14045 has been updated by myronmarston (Myron Marston).


This change introduces a bug in RSpec.  I'm working on a work around for RSpec (and hope to cut a release with a fix soon) but users running Ruby 2.5 with an older RSpec version will be affected, and the slight change in semantics introduced by this change might create bugs in other libraries and applications as well.

In RSpec, we've defined a `Hook` struct  (which stores a block plus some associated metadata), and depend upon `==` working properly to compare two `Hook` instances.  The fact that the proc is now initialized lazily is causing problems because what is fundamentally the same block can wind up with two different proc instances, whereas it had only one before.  This causes two `Hook` instances which were equal before to no longer be considered equal.  Here's a script that demonstrates the regression:

``` ruby
# regression.rb
def return_proc(&block)
  block
end

def return_procs(&block)
  block.inspect if ENV['INSPECT_BLOCK']

  proc_1 = return_proc(&block)
  proc_2 = return_proc(&block)

  return proc_1, proc_2
end

proc_1, proc_2 = return_procs { }

puts RUBY_VERSION
puts "Proc equality: #{proc_1 == proc_2}"
```

Here's the output on Ruby 2.4 vs Ruby 2.5:

```
$ chruby 2.4
$ ruby regression.rb
2.4.2
Proc equality: true
$ chruby 2.5
$ ruby regression.rb
2.5.0
Proc equality: false
$ INSPECT_BLOCK=1 ruby regression.rb
2.5.0
Proc equality: true
```

As the output shows, the two procs were equal on 2.4 but are no longer equal on 2.5.  However, if we call a method on the block (such as inspecting it), it defeats the lazy initialization and allows them to still be equal.

As I said, I'm working on addressing this change in RSpec, and while I can fairly easily fix the tests that fail as a result of this, I'm concerned that there might be other bugs this introduces that are not caught by our test suite.

Is there a way to keep this feature w/o introducing this regression?  If not, it might be worth considering reverting it since it does introduce a regression, and a very subtle one at that.

Thanks!

----------------------------------------
Feature #14045: Lazy Proc allocation for block parameters
https://bugs.ruby-lang.org/issues/14045#change-69117

* Author: ko1 (Koichi Sasada)
* Status: Closed
* Priority: Normal
* Assignee: ko1 (Koichi Sasada)
* Target version: 
----------------------------------------
# Background

If we need to pass given block, we need to capture by block parameter as a Proc object and pass it parameter as a block argument. Like that:

```
def block_yield
  yield
end

def block_pass &b
  # do something
  block_yield(&b)
end
```

There are no way to pass given blocks to other methods without using block parameters.

One problem of this technique is performance. `Proc` creation is one of heavyweight operation because we need to store all of local variables (represented by Env objects in MRI internal) to heap. If block parameter is declared as one of method parameter, we need to make a new `Proc` object for the block parameter.

# Proposal: Lazy Proc allocation for 

To avoid this overhead, I propose lazy Proc creation for block parameters.

Ideas:

* At the beginning of method, a block parameter is `nil`
* If block parameter is accessed, then create a `Proc` object by given block.
* If we pass the block parameter to other methods like `block_yield(&b)` then don't make a `Proc`, but pass given block information.

We don't optimize `b.call` type block invocations. If we call block with `b.call`, then create `Proc` object.We need to hack more because `Proc#call` is different from `yield` statement (especially they can change `$SAFE`).

# Evaluation

```
def iter_yield
  yield
end

def iter_pass &b
  iter_yield(&b)
end

def iter_yield_bp &b
  yield
end

def iter_call &b
  b.call
end

N = 10_000_000 # 10M

require 'benchmark'
Benchmark.bmbm(10){|x|
  x.report("yield"){
    N.times{
      iter_yield{}
    }
  }
  x.report("yield_bp"){
    N.times{
      iter_yield_bp{}
    }
  }
  x.report("yield_pass"){
    N.times{
      iter_pass{}
    }
  }
  x.report("send_pass"){
    N.times{
      send(:iter_pass){}
    }
  }
  x.report("call"){
    N.times{
      iter_call{}
    }
  }
}

__END__

ruby 2.5.0dev (2017-10-24 trunk 60392) [x86_64-linux]
                 user     system      total        real
yield        0.634891   0.000000   0.634891 (  0.634518)
yield_bp     2.770929   0.000008   2.770937 (  2.769743)
yield_pass   3.047114   0.000000   3.047114 (  3.046895)
send_pass    3.322597   0.000002   3.322599 (  3.323657)
call         3.144668   0.000000   3.144668 (  3.143812)

modified
                 user     system      total        real
yield        0.582620   0.000000   0.582620 (  0.582526)
yield_bp     0.731068   0.000000   0.731068 (  0.730315)
yield_pass   0.926866   0.000000   0.926866 (  0.926902)
send_pass    1.110110   0.000000   1.110110 (  1.109579)
call         2.891364   0.000000   2.891364 (  2.890716)
```

# Related work

To delegate the given block to other methods, Single `&` block parameter had been proposed (https://bugs.ruby-lang.org/issues/3447#note-18) (using like: `def foo(&); bar(&); end`). This idea is straightforward to represent `block passing`. Also we don't need to name a block parameter.

The advantage of this ticket proposal is we don't change any syntax. We can write compatible code for past versions.

Thanks,
Koichi



-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request / ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>