Charles Nutter <headius / headius.com> wrote:
> I wonder, though, if depending on this behavior is leading Ruby more
> and more down the GVL path. The designers of the JVM's core IO
> libraries, for example, were unable to reconcile concurrent native
> threads with interruptible IO, due to the impossibility of knowing
> what state all IO-related data structures are in when the thread is
> interrupted.

I don't think so.  Even if threads are interrupted, they're resumed after
the signal handler is done (or the process is dying anyways and we don't
care).  If the interrupt is to raise an exception then that could get
messy[1], but for the general case of signal handlers it's not an issue.
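
A minimal sketch of that general case, assuming MRI on a POSIX-like
system.  The pipe, the choice of USR1, and the variable names are only
for illustration and are not taken from any patch:

```ruby
# Sketch: a blocking read is interrupted by a signal, the handler runs,
# and the read then carries on as if nothing had happened.
r, w = IO.pipe
handled = false

# Hypothetical handler: it only records that it ran.
Signal.trap("USR1") { handled = true }

helper = Thread.new do
  sleep 0.1
  Process.kill("USR1", Process.pid)  # interrupt the blocked main thread
  sleep 0.1
  w.write("x")                       # now let the read complete
end

data = r.read(1)  # blocks until the helper writes; the signal arrives meanwhile
helper.join

puts "handler ran: #{handled}, read returned: #{data.inspect}"
```

The read is never reported as failed to the caller; the handler is a
detour, not an error.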

> As a result, IO channels performing blocking operations
> are explicitly closed when the thread they block is interrupted.

That is terrible.  I'd never touch a platform that does that.

> It seems that your change (and others like it) makes Ruby even more
> dependent on kernel-level blocking IO operations always being safely
> interruptible, and depending on those interruptions to only occur at
> the exact boundaries defined by the GVL. A future concurrent-threaded
> Ruby (or other impls that may become concurrent-threaded) may want to
> consider this, no? And are there any cross-platform concerns from
> eliminating select in these cases?

If there are cross-platform concerns, the functions that wrap select()
should be made no-ops on platforms where select() is not needed (all
POSIX-like ones, I expect) and should not interfere with platforms where
select() is needed.

Regardless, there'll always be a set of IO operations that can never be
interrupted.  That doesn't bother me at all since the rest of the VM
still runs.  I'd rather just not use select()/poll() at all for
"blocking" I/O calls.

> I also wonder if there's a race condition here; is it not possible
> that the interrupt of a thread would fire immediately after the GVL
> has been released but before the blocking IO operation has fired?
> Perhaps I'm birdwalking too deep into the vagaries of MRI's IO logic.

So a signal handler might fire and the syscall would just continue and
not fail with EINTR.  No big deal, it'll just finish the syscall before
checking for interrupts.

The real race condition is relying on select()/poll() at all to check
readability.  A successful return from select()/poll() _never_
guarantees that the following operation won't block, because of spurious
wakeups and IO objects shared across multiple threads/processes.
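
A small sketch of the shared-IO case, assuming a POSIX-like system.  The
second consumer is simulated with a dup'd descriptor in the same process
so that the ordering is deterministic; in real code it would be another
thread or process, and the names here are made up for the example:

```ruby
# Sketch: select() reports the pipe readable, another consumer takes the
# data first, and the read that follows would have blocked.
r, w = IO.pipe
w.write("x")

# select() says r is readable right now.
ready, = IO.select([r], nil, nil, 1)

# Simulated competing consumer sharing the same pipe.
other = r.dup
other.read_nonblock(1)

# The "guaranteed" read finds nothing.  read_nonblock is used so the
# example reports the condition instead of hanging.
outcome =
  begin
    r.read_nonblock(1)
    :read
  rescue IO::WaitReadable
    :would_block
  end

puts "select said ready: #{ready == [r]}, read outcome: #{outcome}"
```

With a plain blocking read in place of read_nonblock, the last step
would simply hang despite select() having returned success.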


[1] - which is why rb_ensure() is used in some places, such as around
      select(), so that rb_fd_init() is always paired with rb_fd_term()

-- 
Eric Wong