Issue #6615 has been updated by drbrain (Eric Hodel).


Since the last benchmark was so close, this benchmark uses the same deflate file set, but adds GVL contention by inflating the set on four threads at once.  Since no buffer expansion occurs, this measures the cost of a GVL release versus rb_thread_schedule() in the present Zlib code.

Code:

  require 'zlib'
  require 'benchmark'
  
  r = Random.new 0
  
  file_count = 100_000 # the inclusive range below yields file_count + 1 inputs
  
  deflated = (0..file_count).map do
    input = r.bytes 1000
    Zlib::Deflate.deflate input
  end
  
  times = Benchmark.measure do
    (0..3).map do
      Thread.new do
        deflated.each do |input|
          Zlib::Inflate.inflate input
        end
      end
    end.each do |t|
      t.join
    end
  end
  
  puts times

Without patch:

$ for f in `jot 5`; do ruby20 test.rb; done
  5.420000   5.970000  11.390000 (  8.162893)
  5.400000   6.270000  11.670000 (  8.263046)
  5.460000   5.920000  11.380000 (  8.133742)
  5.410000   6.290000  11.700000 (  8.289913)
  5.500000   6.620000  12.120000 (  8.478085)

With patch:

$ for f in `jot 5`; do make runruby; done
  5.120000   6.240000  11.360000 (  8.039715)
  5.240000   6.260000  11.500000 (  8.097961)
  5.280000   5.940000  11.220000 (  8.004246)
  5.210000   6.360000  11.570000 (  8.171124)
  5.240000   6.200000  11.440000 (  8.054929)

Again, a slight improvement, but not as great as with the video.
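
To put a rough number on "slight", the wall-clock column of the two result sets above can be averaged. This is only a sketch over five runs each, so it says nothing about variance; the figures are copied from the output above and the variable names are illustrative. It works out to roughly 8.27s without the patch and 8.07s with it, about a 2% improvement.

```ruby
# Wall-clock (real) seconds copied from the two result sets above.
without_patch = [8.162893, 8.263046, 8.133742, 8.289913, 8.478085]
with_patch    = [8.039715, 8.097961, 8.004246, 8.171124, 8.054929]

# Arithmetic mean of an array of floats.
mean = ->(xs) { xs.sum / xs.size }

before = mean.(without_patch)
after  = mean.(with_patch)

# Relative reduction in mean wall-clock time, as a percentage.
improvement = (before - after) / before * 100

printf "without patch: %.3fs\nwith patch:    %.3fs\nimprovement:   %.1f%%\n",
       before, after, improvement
```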

The inflate function runs in parallel for only a small fraction of the time, but that is enough to make up for the cost of releasing and reacquiring the GVL.
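
One way to estimate how much of the inflate work actually overlaps is to compare the same total work done on one thread against the work spread over four threads. The sketch below is not part of the benchmark above: the smaller file count and the names serial and threaded are assumptions chosen so it runs quickly. If the threaded time is lower than the serial time, some inflation ran in parallel; if it is higher, the thread switching cost dominates.

```ruby
require 'zlib'
require 'benchmark'

r = Random.new 0

# Smaller, hypothetical file count so the sketch finishes quickly.
file_count   = 2_000
thread_count = 4

deflated = Array.new(file_count) do
  Zlib::Deflate.deflate r.bytes(1000)
end

# Inflate the whole set thread_count times on a single thread.
serial = Benchmark.realtime do
  thread_count.times do
    deflated.each { |input| Zlib::Inflate.inflate input }
  end
end

# Inflate the whole set once on each of thread_count threads.
threaded = Benchmark.realtime do
  Array.new(thread_count) do
    Thread.new do
      deflated.each { |input| Zlib::Inflate.inflate input }
    end
  end.each(&:join)
end

printf "serial:   %.3fs\nthreaded: %.3fs\n", serial, threaded
```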
----------------------------------------
Feature #6615: Release GVL in zlib when calling inflate() or deflate()
https://bugs.ruby-lang.org/issues/6615#change-27500

Author: drbrain (Eric Hodel)
Status: Open
Priority: Normal
Assignee: 
Category: ext
Target version: 2.0.0


This patch switches zstream_run from using rb_thread_schedule() to rb_thread_blocking_region().

I don't see a way to safely interrupt deflate() or inflate(), so the unblocking function is empty.

This patch should allow the use of output buffer sizes larger than 16KB.  I suspect 16KB was chosen to allow reasonable context-switching time for Ruby 1.8 and earlier.  A larger buffer size would reduce GVL contention when processing large streams.
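
For context, a large stream is commonly processed from Ruby by feeding a single Zlib::Inflate object piece by piece, roughly as sketched below. The 16KB CHUNK here is only an illustrative input size picked by the caller; it is not the internal output buffer size discussed above, which is set inside the extension and is assumed not to be adjustable from Ruby code.

```ruby
require 'zlib'
require 'stringio'

# Random bytes barely compress, so the deflated stream stays large
# enough to span several chunks.
original = Random.new(0).bytes(200_000)
deflated = Zlib::Deflate.deflate(original)

# Illustrative input chunk size chosen by the caller (an assumption,
# not zlib's internal output buffer size).
CHUNK = 16 * 1024

inflater = Zlib::Inflate.new
io       = StringIO.new(deflated)
restored = String.new # empty binary string to collect the output

# Each call hands one piece of input to the same inflate stream.
while (chunk = io.read(CHUNK))
  restored << inflater.inflate(chunk)
end

# Flush any output still buffered in the stream, then release it.
restored << inflater.finish
inflater.close

puts restored == original
```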

An alternative way to reduce GVL contention would be to move zstream_run's loop outside the GVL, but some manual allocation would be required, as the loop currently uses a Ruby String as the output buffer.


-- 
http://bugs.ruby-lang.org/