Bug #1696: http downloads are unuseably slow
http://redmine.ruby-lang.org/issues/show/1696

Author: Steven Hartland
Status: Open, Priority: High
ruby -v: ruby 1.8.6 (2008-08-11 patchlevel 287) [amd64-freebsd7]

I've been using a a bit of code which use open-uri and http urls to download files and have found this is unuseably slow.
Example:

require 'open-uri'
open( <http uri> )

I did some benchmarking and the download of a 157MB file from the local machine was taking over 20 seconds and using 100% where as wget for the same file only took 0.7seconds.

Digging some more and profiling the code with RubyProf revealed that during the execution of the download over 11,000 threads where being created. This I've tracked down to the net/protocol module and the Net::BufferIO::rbuf_fill method which is using a timeout block to wrap the @io.sysread(1024) call. This is clearly an extremely bad way to do this, totally crippling the performance.

Below are benchmarks and some profiles of this issue and various quick fixes I tested here
== 1.8.6 core ==
      user     system      total        real
open 15.664062   6.164062  21.828125 ( 22.810859)

 %self     total     self     wait    child    calls  name
 14.04      5.27     3.00     2.27     0.00   109426  <Class::Thread>#start (ruby_runtime:0}
 10.97     13.14     2.34     0.00    10.80   109427  <Module::Timeout>#timeout (/usr/local/lib/ruby/1.8/timeout.rb:52}
  6.65      2.52     1.42     1.10     0.00   109426  Thread#kill (ruby_runtime:0}
  5.78     21.36     1.23     0.00    20.12        1  Net::BufferedIO#read (/usr/local/lib/ruby/1.8/net/protocol.rb:79}
  5.27     14.27     1.12     0.00    13.14   109427  Object#timeout (/usr/local/lib/ruby/1.8/timeout.rb:92}
  5.12      1.11     1.09     0.02     0.00   109426  IO#sysread (ruby_runtime:0}
  4.75      2.61     1.02     0.00     1.59   109426  OpenURI::Buffer#<< (/usr/local/lib/ruby/1.8/open-uri.rb:297}
  3.40      3.62     0.73     0.00     2.89   109426  Net::ReadAdapter#call_block (/usr/local/lib/ruby/1.8/net/protocol.rb:380}
  2.52      0.54     0.54     0.00     0.00   109435  String#slice! (ruby_runtime:0}
  2.30      1.33     0.49     0.00     0.84   109415  #<Class:0x801ac6f30>#<< ((eval):1}
  2.08      0.50     0.45     0.00     0.05   109427  Class#initialize (ruby_runtime:0}
  2.08      0.98     0.45     0.00     0.54   109435  Net::BufferedIO#rbuf_consume (/usr/local/lib/ruby/1.8/net/protocol.rb:137}
  2.01      4.05     0.43     0.00     3.62   109426  Net::ReadAdapter#<< (/usr/local/lib/ruby/1.8/net/protocol.rb:371}
  1.94      0.41     0.41     0.00     0.00   109416  IO#write (ruby_runtime:0}
  1.72      1.03     0.37     0.00     0.66   109435  Class#new (ruby_runtime:0}
  1.50      0.32     0.32     0.00     0.00   328286  String#size (ruby_runtime:0}
  1.32     14.55     0.28     0.00    14.27   109426  Net::BufferedIO#rbuf_fill (/usr/local/lib/ruby/1.8/net/protocol.rb:131}
  1.21      0.84     0.26     0.00     0.58   109417  Kernel#__send__ (ruby_runtime:0}
  1.13      0.24     0.24     0.00     0.00   109429  String#<< (ruby_runtime:0}
  0.99      0.30     0.21     0.00     0.09   109428  Fixnum#== (ruby_runtime:0}
  0.84      0.18     0.18     0.00     0.00   328289  Fixnum#+ (ruby_runtime:0}
  0.77      0.16     0.16     0.00     0.00   109427  <Class::Class>#allocate (ruby_runtime:0}
  0.77      0.58     0.16     0.00     0.41   109415  IO#<< (ruby_runtime:0}
  0.73      0.28     0.16     0.00     0.12   109475  Hash#[] (ruby_runtime:0}
...

== 1.8.6 core + no timeout block ==
      user     system      total        real
open  1.757812   0.765625   2.523438 (  2.873194)

 %self     total     self     wait    child    calls  name
 21.62      2.63     2.63     0.00     0.00   109426  IO#sysread (ruby_runtime:0}
 15.72      1.91     1.91     0.00     0.00   109429  String#<< (ruby_runtime:0}
 13.98      1.70     1.70     0.00     0.00   109416  IO#write (ruby_runtime:0}
  9.56     12.16     1.16     0.00    10.99        1  Net::BufferedIO#read (/usr/local/lib/ruby/1.8/net/protocol.rb:79}
  7.83      3.73     0.95     0.00     2.77   109426  OpenURI::Buffer#<< (/usr/local/lib/ruby/1.8/open-uri.rb:297}
  5.58      4.68     0.68     0.00     4.00   109426  Net::ReadAdapter#call_block (/usr/local/lib/ruby/1.8/net/protocol.rb:385}
  3.66      4.99     0.45     0.00     4.55   109426  Net::BufferedIO#rbuf_fill (/usr/local/lib/ruby/1.8/net/protocol.rb:138}
  3.34      2.38     0.41     0.00     1.97   109415  #<Class:0x801ac82b8>#<< ((eval):1}
  3.27      0.53     0.40     0.00     0.13   109435  Net::BufferedIO#rbuf_consume (/usr/local/lib/ruby/1.8/net/protocol.rb:142}
  2.95      5.04     0.36     0.00     4.68   109426  Net::ReadAdapter#<< (/usr/local/lib/ruby/1.8/net/protocol.rb:376}
  2.25      0.27     0.27     0.00     0.00   328289  Fixnum#+ (ruby_runtime:0}
  1.54      0.19     0.19     0.00     0.00   328286  String#size (ruby_runtime:0}
  1.28      0.27     0.16     0.00     0.12   109475  Hash#[] (ruby_runtime:0}
  1.22      0.15     0.15     0.00     0.00   109431  Module#=== (ruby_runtime:0}
  1.22      1.85     0.15     0.00     1.70   109415  IO#<< (ruby_runtime:0}
  1.09      0.13     0.13     0.00     0.00   109435  String#slice! (ruby_runtime:0}
  0.96      1.97     0.12     0.00     1.85   109417  Kernel#__send__ (ruby_runtime:0}
  0.96      0.12     0.12     0.00     0.00   109426  String#length (ruby_runtime:0}
  0.96      0.12     0.12     0.00     0.00   109435  Hash#default (ruby_runtime:0}
  0.71      0.09     0.09     0.00     0.00   109439  Fixnum#< (ruby_runtime:0}
...

== 1.8.6 core + no timeout block + 10K sysread ==
      user     system      total        real
open  0.570312   1.406250   1.976562 (  2.521614)

 %self     total     self     wait    child    calls  name
 27.32      0.39     0.39     0.00     0.00    10953  String#slice! (ruby_runtime:0}
 18.58      0.27     0.27     0.00     0.00    10943  IO#write (ruby_runtime:0}
 12.02      0.17     0.17     0.00     0.00    10944  IO#sysread (ruby_runtime:0}
  8.74      1.42     0.12     0.00     1.30        1  Net::BufferedIO#read (/usr/local/lib/ruby/1.8/net/protocol.rb:79}
  7.10      0.50     0.10     0.00     0.40    10944  OpenURI::Buffer#<< (/usr/local/lib/ruby/1.8/open-uri.rb:297}
  5.46      0.37     0.08     0.00     0.29    10942  #<Class:0x801ac8b28>#<< ((eval):1}
  3.28      0.05     0.05     0.00     0.00    32840  String#size (ruby_runtime:0}
  2.73      0.55     0.04     0.00     0.52    10944  Net::ReadAdapter#call_block (/usr/local/lib/ruby/1.8/net/protocol.rb:385}
  2.73      0.04     0.04     0.00     0.00    32843  Fixnum#+ (ruby_runtime:0}
  2.73      0.59     0.04     0.00     0.55    10944  Net::ReadAdapter#<< (/usr/local/lib/ruby/1.8/net/protocol.rb:376}
  1.64      0.21     0.02     0.00     0.19    10944  Net::BufferedIO#rbuf_fill (/usr/local/lib/ruby/1.8/net/protocol.rb:138}
  1.64      0.41     0.02     0.00     0.39    10953  Net::BufferedIO#rbuf_consume (/usr/local/lib/ruby/1.8/net/protocol.rb:142}
  1.09      0.02     0.02     0.00     0.00    10944  String#length (ruby_runtime:0}
  1.09      0.02     0.02     0.00     0.00    10947  String#<< (ruby_runtime:0}
  1.09      0.02     0.02     0.00     0.00    10953  Hash#default (ruby_runtime:0}
  1.09      0.28     0.02     0.00     0.27    10942  IO#<< (ruby_runtime:0}
...

== 1.8.6 core + no timeout block + 100K sysread ==
      user     system      total        real
open  0.390625   0.484375   0.875000 (  1.013516)

 %self     total     self     wait    child    calls  name
 45.45      0.43     0.43     0.00     0.00     1098  IO#sysread (ruby_runtime:0}
 33.88      0.32     0.32     0.00     0.00     1097  IO#write (ruby_runtime:0}
  8.26      0.08     0.08     0.00     0.00     1107  String#slice! (ruby_runtime:0}
  1.65      0.45     0.02     0.00     0.44     1098  Net::BufferedIO#rbuf_fill (/usr/local/lib/ruby/1.8/net/protocol.rb:138}
  1.65      0.02     0.02     0.00     0.00     3302  String#size (ruby_runtime:0}
  1.65      0.35     0.02     0.00     0.34     1098  OpenURI::Buffer#<< (/usr/local/lib/ruby/1.8/open-uri.rb:297}
  1.65      0.93     0.02     0.00     0.91        1  Net::BufferedIO#read (/usr/local/lib/ruby/1.8/net/protocol.rb:79}
  0.83      0.01     0.01     0.00     0.00        1  URI::HTTP#initialize (/usr/local/lib/ruby/1.8/uri/http.rb:77}
  0.83      0.09     0.01     0.00     0.08     1107  Net::BufferedIO#rbuf_consume (/usr/local/lib/ruby/1.8/net/protocol.rb:142}
  0.83      0.01     0.01     0.00     0.00      357  Module#method_added (ruby_runtime:0}
  0.83      0.01     0.01     0.00     0.00       37  <Class::Dir>#[] (ruby_runtime:0}
  0.83      0.01     0.01     0.00     0.00     1101  String#<< (ruby_runtime:0}
  0.83      0.36     0.01     0.00     0.35     1098  Net::ReadAdapter#call_block (/usr/local/lib/ruby/1.8/net/protocol.rb:385}
  0.83      0.33     0.01     0.00     0.32     1096  IO#<< (ruby_runtime:0}
...

== 1.8.6 core + no timeout block + 1M sysread ==
      user     system      total        real
open  0.445312   0.335938   0.781250 (  0.913736)

 %self     total     self     wait    child    calls  name
 29.66      0.27     0.27     0.00     0.00      157  IO#sysread (ruby_runtime:0}
 28.81      0.27     0.27     0.00     0.00      166  String#slice! (ruby_runtime:0}
 28.81      0.27     0.27     0.00     0.00      156  IO#write (ruby_runtime:0}
  7.63      0.07     0.07     0.00     0.00      160  String#<< (ruby_runtime:0}
  1.69      0.30     0.02     0.00     0.28      157  Net::ReadAdapter#call_block (/usr/local/lib/ruby/1.8/net/protocol.rb:385}
  0.85      0.01     0.01     0.00     0.00       37  <Class::Dir>#[] (ruby_runtime:0}
  0.85      0.01     0.01     0.00     0.00       98  Class#inherited (ruby_runtime:0}
  0.85      0.02     0.01     0.00     0.02        2  Kernel#gem_original_require (ruby_runtime:0}
  0.85      0.28     0.01     0.00     0.27      157  OpenURI::Buffer#<< (/usr/local/lib/ruby/1.8/open-uri.rb:297}
...

== 1.8.6 core + no timeout block + 1M sysread + slice bypass ==
      user     system      total        real
open  0.093750   0.523438   0.617188 (  0.717748)

 %self     total     self     wait    child    calls  name
 66.67      0.23     0.23     0.00     0.00       52  IO#write (ruby_runtime:0}
 13.33      0.05     0.05     0.00     0.00       52  IO#sysread (ruby_runtime:0}
  4.44      0.02     0.02     0.00     0.00       55  String#<< (ruby_runtime:0}
  4.44      0.02     0.02     0.00     0.00        1  TCPSocket#initialize (ruby_runtime:0}
  2.22      0.01     0.01     0.00     0.00      101  Hash#[] (ruby_runtime:0}
  2.22      0.01     0.01     0.00     0.00       10  String#slice! (ruby_runtime:0}
  2.22      0.02     0.01     0.00     0.01        2  Kernel#gem_original_require (ruby_runtime:0}
  2.22      0.25     0.01     0.00     0.24       52  OpenURI::Buffer#<< (/usr/local/lib/ruby/1.8/open-uri.rb:297}
  2.22      0.01     0.01     0.00     0.00       37  <Class::Dir>#[] (ruby_runtime:0}
...

Note: The change does also raises questions about the core slice method not performing as well as it might for the no-op case.

== 1.8.6 core + select timeout + 1M sysread + slice bypass ==
      user     system      total        real
open  0.070312   0.226562   0.296875 (  0.898285)

 %self     total     self     wait    child    calls  name
 64.10      0.20     0.20     0.00     0.00       49  IO#write (ruby_runtime:0}
 15.38      0.05     0.05     0.00     0.00       49  IO#sysread (ruby_runtime:0}
  7.69      0.02     0.02     0.00     0.00       52  String#<< (ruby_runtime:0}
  5.13      0.23     0.02     0.00     0.21       49  OpenURI::Buffer#<< (/usr/local/lib/ruby/1.8/open-uri.rb:297}
  2.56      0.02     0.01     0.00     0.02        2  Kernel#gem_original_require (ruby_runtime:0}
  2.56      0.01     0.01     0.00     0.00       37  <Class::Dir>#[] (ruby_runtime:0}
  2.56      0.02     0.01     0.00     0.01        3  Array#each (ruby_runtime:0}
...

So overall these changes improve the speed from 22seconds to 0.7seconds in my test case making ruby comparable with wget's performance.

The patch for this last run is attached and should be compatible with existing functionality through the use of select + raise Timeout::Error instead of timeout( @read_timeout ) { .. }

Hope this helps, any questions let me know


----------------------------------------
http://redmine.ruby-lang.org