Issue #8875 has been updated by headius (Charles Nutter).


I have been experimenting with some fixes for this.

In JRuby, my first fix was to make IO.select aware of SSLSocket's native buffers by adding a method to query whether the SSLSocket itself has buffered data. When it does, the socket is added to the list of ready read streams and no blocking select is attempted on it. This fixes the simpler issue of sysread/sysread_nonblock reading more data than requested (potentially draining the stream) and allows select to work correctly.
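To illustrate the idea, here is a rough sketch in plain Ruby. It is not the JRuby patch itself: select_with_pending is a hypothetical helper name, and it assumes the buffered-data query is exposed as #pending (as it is on MRI's SSLSocket) and that the object converts to its underlying IO via #to_io.

```ruby
# Hypothetical helper, not the actual JRuby change: a select wrapper that
# treats any reader reporting buffered bytes as immediately readable.
def select_with_pending(readers, timeout = nil)
  # Assumption: buffered data is reported by #pending (bytes already decrypted
  # and held inside the SSL layer).
  ready, rest = readers.partition { |r| r.respond_to?(:pending) && r.pending > 0 }

  # Everything already has buffered data: no need to touch the OS at all.
  return [ready, [], []] if rest.empty? && !ready.empty?

  # If some reader is already ready, only poll the others (timeout 0) so we
  # never block while buffered data is waiting.
  selected = IO.select(rest, nil, nil, ready.empty? ? timeout : 0)
  ready.concat(selected[0]) if selected

  ready.empty? ? nil : [ready, [], []]
end
```

A caller would use it where it currently calls IO.select([ssl_socket], nil, nil, timeout). Note that this only covers the SSL-level buffer, not anything held in a Ruby-land buffer.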

However, the buffering issue is harder to fix. I believe buffering.rb needs to go away entirely, or at least needs to not buffer data on its own.

A partial patch for buffering.rb (it does not fix all uses of the Ruby-land buffer) allows my original case to run to completion, because the only remaining buffers are the ones in SSLSocket proper, and my IO.select patch can see those.

Here's the patch as it stands right now: https://gist.github.com/headius/6477733
----------------------------------------
Bug #8875: Select is not usable with SSLSocket
https://bugs.ruby-lang.org/issues/8875#change-41667

Author: headius (Charles Nutter)
Status: Open
Priority: Normal
Assignee: 
Category: ext/openssl
Target version: 
ruby -v: all
Backport: 1.9.3: UNKNOWN, 2.0.0: UNKNOWN


Because of the various levels of buffering SSLSocket employs, it is not possible to reliably use IO.select to check when it has data available.

SSLSocket wraps a normal IO that it uses for reading and writing the encrypted data. This IO has its own buffers, at the OS/libc level.

Select normally operates against IO, checking whether data has been buffered or is available on the wire. However, in order to decrypt data on the wire, SSLSocket often has to read more data than was requested, potentially draining the stream. This is problem #1.

This problem can be mitigated by making IO.select know that it's an SSLSocket and that it may have its own buffers.

However, there's another layer of buffering that happens in openssl/buffering.rb, where read, readpartial, read_nonblock, and methods that call them eventually hit fill_rbuf, which can easily drain both the IO buffers and the SSLSocket buffers into a Ruby-land buffer IO.select does not know about.
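The effect can be reproduced without SSL at all. The following is a toy sketch, not the real buffering.rb: BufferedReader and its #buffered method are hypothetical names, a pipe stands in for the socket, and the chunk size is chosen to match the 16k reads described in this report.

```ruby
# Toy stand-in for the Ruby-land read buffer (illustrative names only).
class BufferedReader
  BLOCK_SIZE = 16 * 1024 # chunk size pulled from the underlying IO per fill

  def initialize(io)
    @io = io
    @rbuffer = +""
  end

  # IO.select calls this to find the descriptor to wait on.
  def to_io
    @io
  end

  # Pull up to BLOCK_SIZE bytes from the underlying IO into the Ruby-land buffer.
  def fill_rbuf
    @rbuffer << @io.readpartial(BLOCK_SIZE)
  end

  # Return up to `size` bytes, refilling from the IO only when the buffer is empty.
  def readpartial(size)
    fill_rbuf if @rbuffer.empty?
    @rbuffer.slice!(0, size)
  end

  # Bytes held in the Ruby-land buffer, invisible to IO.select.
  def buffered
    @rbuffer.bytesize
  end
end

r, w = IO.pipe
w.write("a" * 100)
reader = BufferedReader.new(r)

first = reader.readpartial(10)              # drains all 100 bytes from the pipe
ready = IO.select([reader], nil, nil, 0.1)  # times out: the pipe itself is empty

puts "read #{first.bytesize} bytes, #{reader.buffered} still buffered, select => #{ready.inspect}"
```

Ninety bytes are still waiting to be read, but select reports nothing readable because it only sees the underlying descriptor.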

An example script is here: https://gist.github.com/headius/6477345

While investigating why this hangs on JRuby (under the original assumption that it was a JRuby issue), I realized that fill_rbuf reads 16k bytes at a time to try to fill its internal buffer. This effectively drains all data in all buffers visible to IO.select, causing select to hang after the first read.

ruby-head (a few months old), Ruby 1.9.3p253, Ruby 1.8.7p358, JRuby (all versions), and Rubinius (all versions) are affected, because we all share buffering.rb which is where the problem lies.

This may be a known issue, but we continue to get bug reports from Ruby users claiming JRuby is failing to support select + SSLSocket correctly. I'd like to figure out if there's anything we as a community can do to fix this.


-- 
http://bugs.ruby-lang.org/