Issue #13618 has been updated by ioquatix (Samuel Williams).


I'm still not clear why a new name needs to be introduced. `Fiber` should be sufficient IMHO. If you want to enable/disable auto yield on blocking operations, perhaps make it an option to `Fiber`, e.g. `Fiber.new(non_blocking: true)`. That way, it would be a minimally invasive drop-in to existing code, and client code could easily switch between blocking and non-blocking behaviour. That, and you wouldn't need to invent a name which relates to chicken farming :p

It would also be pretty awesome if you could actually supply a reactor to use, e.g. `Fiber.new(io_reactor: my_reactor)`. In this case, blocking operations would call `my_reactor.wait_readable(io)` and `my_reactor.wait_writable(io)`. Something like this allows the IO policy to be more flexible than a global per-process reactor or other similar implementation. I'm still personally against a global IO reactor as it's an unnecessary point of thread contention and complexity (i.e. how does it work with fork? does every IO operation require locking a mutex?)
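
To make the reactor idea a bit more concrete, here is a minimal sketch of what such an object could look like. It is only an illustration under assumptions: `Fiber.new(io_reactor: ...)` does not exist, so the fibers below call the reactor explicitly, and `SketchReactor`, `spawn` and `run` are hypothetical names made up for this example. It uses nothing beyond plain `Fiber` and `IO.select`.

```ruby
require 'fiber' # for Fiber.current on older Rubies

# Hypothetical per-fiber reactor. The class and its method names are
# assumptions for illustration, not an existing or proposed Ruby API.
class SketchReactor
  def initialize
    @readers = {} # io => fiber parked until io is readable
    @writers = {} # io => fiber parked until io is writable
  end

  # Park the calling fiber until io becomes readable.
  def wait_readable(io)
    @readers[io] = Fiber.current
    Fiber.yield
  end

  # Park the calling fiber until io becomes writable.
  def wait_writable(io)
    @writers[io] = Fiber.current
    Fiber.yield
  end

  # Create a fiber and run it until it first parks or finishes.
  def spawn(&block)
    Fiber.new(&block).resume
  end

  # Resume parked fibers as their IO becomes ready, until none are left.
  def run
    until @readers.empty? && @writers.empty?
      readable, writable = IO.select(@readers.keys, @writers.keys)
      readable.each { |io| @readers.delete(io).resume }
      writable.each { |io| @writers.delete(io).resume }
    end
  end
end

reactor = SketchReactor.new
r, w = IO.pipe
results = []

reactor.spawn do
  reactor.wait_readable(r)        # parks until the writer has written
  results << r.read_nonblock(16)
end

reactor.spawn do
  reactor.wait_writable(w)
  w.write_nonblock("hello")
end

reactor.run
```

Because the reactor is just an object, nothing here is global: each thread (or each group of fibers) could own its own instance, with no shared mutex. This sketch only tracks one waiting fiber per IO per direction, which is a simplification a real implementation would need to lift.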

So, as mentioned earlier, `libpq` and the associated `pq` gems won't suddenly become asynchronous because of this patch (there seems to be a bit of a misunderstanding about how this works under the hood). In fact, we can already achieve massive concurrency improvements using `async`, and I've tested this using `puma` and `falcon`. The difference was pretty big! I wrote up a summary here: https://github.com/socketry/async-postgres#performance - feel free to come and chat in https://gitter.im/socketry/async as there are quite a few people there who are interested in the direction of asynchronous IO in Ruby.

So, again, I think this patch simply does too much. It should be split into 1/ a standard IO reactor for Ruby with a standard interface that others can implement (could easily be a gem) and 2/ an extension to Fiber that supports non-blocking operations by way of a supplied IO reactor (very minimal surface area/names required). For the most part, that is close to how https://github.com/socketry/async works, and if you take this approach `async` could build on top of it. I don't think it's a stupid idea to allow things like EventMachine, async, and other IO reactors to work together.

Just FYI, I'm not sure what visibility you have on other projects, but there are at least two I know of implementing similar concepts:

https://github.com/chuckremes/ruby-io

https://github.com/socketry/lightio

Even as the author of async, I don't feel it's a one size fits all solution and I don't even want `async` to become a standard solution. I think it's great we have options like the above, and I think if we design the Fiber API correctly, it should absolutely be possible to a/ work across different implementations of Ruby efficiently and b/ compose existing and new IO reactors/models without hurting backwards compatibility.

I think that if people want to implement their own IO scheduling policies, on a per-fiber basis, that would be pretty awesome.

----------------------------------------
Feature #13618: [PATCH] auto fiber schedule for rb_wait_for_single_fd and rb_waitpid
https://bugs.ruby-lang.org/issues/13618#change-70962

* Author: normalperson (Eric Wong)
* Status: Assigned
* Priority: Normal
* Assignee: normalperson (Eric Wong)
* Target version: 
----------------------------------------
```
auto fiber schedule for rb_wait_for_single_fd and rb_waitpid

Implement automatic Fiber yield and resume when running
rb_wait_for_single_fd and rb_waitpid.

The Ruby API changes for Fiber are named after existing Thread
methods.

main Ruby API:

    Fiber#start -> enable auto-scheduling and run Fiber until it
                   automatically yields (due to EAGAIN/EWOULDBLOCK)

The following behave like their Thread counterparts:

    Fiber.start - Fiber.new + Fiber#start (prelude.rb)
    Fiber#join - run internal scheduler until Fiber is terminated
    Fiber#value - ditto
    Fiber#run - like Fiber#start (prelude.rb)

Right now, it takes over rb_wait_for_single_fd() and
rb_waitpid() function if the running Fiber is auto-enabled
(cont.c::rb_fiber_auto_sched_p)

Changes to existing functions are minimal.

New files (all new structs and relations should be documented):

    iom.h - internal API for the rest of RubyVM (incomplete?)
    iom_internal.h - internal header for iom_(select|epoll|kqueue).h
    iom_epoll.h - epoll-specific pieces
    iom_kqueue.h - kqueue-specific pieces
    iom_select.h - select-specific pieces
    iom_pingable_common.h - common code for iom_(epoll|kqueue).h
    iom_common.h - common footer for iom_(select|epoll|kqueue).h

Changes to existing data structures:

    rb_thread_t.afrunq   - list of fibers to auto-resume
    rb_vm_t.iom          - Ruby I/O Manager (rb_iom_t) :)

Besides rb_iom_t, all the new structs are stack-only and rely
extensively on ccan/list for branch-less, O(1) insert/delete.

As usual, understanding the data structures first should help
you understand the code.

Right now, I reuse some static functions in thread.c,
so thread.c includes iom_(select|epoll|kqueue).h

TODO:

    Hijack other blocking functions (IO.select, ...)

I am using "double" for timeout since it is more convenient for
arithmetic like parts of thread.c.   Most platforms have good FP,
I think.  Also, all "blocking" functions (rb_iom_wait*) will
have timeout support.

./configure gains a new --with-iom=(select|epoll|kqueue) switch

libkqueue:

  libkqueue support is incomplete; corner cases are not handled well:

    1) multiple fibers waiting on the same FD
    2) waiting for both read and write events on the same FD

  Bugfixes to libkqueue may be necessary to support all corner cases.
  Supporting these corner cases for native kqueue was challenging,
  even.  See comments on iom_kqueue.h and iom_epoll.h for
  nuances.

Limitations

Test script I used to download a file from my server:
----8<---
require 'net/http'
require 'uri'
require 'digest/sha1'
require 'fiber'

url = 'http://80x24.org/git-i-forgot-to-pack/objects/pack/pack-97b25a76c03b489d4cbbd85b12d0e1ad28717e55.idx'

uri = URI(url)
use_ssl = "https" == uri.scheme
fibs = 10.times.map do
  Fiber.start do
    cur = Fiber.current.object_id
    # XXX getaddrinfo() and connect() are blocking
    # XXX resolv/replace + connect_nonblock
    Net::HTTP.start(uri.host, uri.port, use_ssl: use_ssl) do |http|
      req = Net::HTTP::Get.new(uri)
      http.request(req) do |res|
        dig = Digest::SHA1.new
        res.read_body do |buf|
          dig.update(buf)
          #warn "#{cur} #{buf.bytesize}\n"
        end
        warn "#{cur} #{dig.hexdigest}\n"
      end
    end
    warn "done\n"
    :done
  end
end

warn "joining #{Time.now}\n"
fibs[-1].join(4)
warn "joined #{Time.now}\n"
all = fibs.dup

warn "1 joined, wait for the rest\n"
until fibs.empty?
  fibs.each(&:join)
  fibs.keep_if(&:alive?)
  warn fibs.inspect
end

p all.map(&:value)

Fiber.new do
  puts 'HI'
end.run.join
```
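
For reference, the patch description says `Fiber.start`, `Fiber#join` and `Fiber#value` behave like their Thread counterparts. The following sketch shows only the existing `Thread` behaviour those names are modelled on. It does not use the patch, and whether the Fiber versions match in every detail is an assumption.

```ruby
# Existing Thread API only; nothing from the patch is used here.
threads = 3.times.map do |i|
  Thread.start do
    sleep 0.01 * i
    i * 2            # the block's return value becomes Thread#value
  end
end

slow = Thread.start do
  sleep 0.5
  :done
end

timed_out = slow.join(0.01)   # join with a limit returns nil if it expires
values = threads.map(&:value) # value joins first, then returns the result
slow.join                     # join without a limit waits until termination
final = slow.value
```

This is the same shape as the test script above: `fibs[-1].join(4)` is a join with a limit, and `all.map(&:value)` collects each block's result.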


---Files--------------------------------
0001-auto-fiber-schedule-for-rb_wait_for_single_fd-and-rb.patch (82.8 KB)

