Issue #13618 has been updated by ko1 (Koichi Sasada).


Thank you for your great work.

# summary of this comment

Recent days I'm thinking about this feature's "safety" or "dependability".
Because of this issue, I think it is difficult to employ this feature right now.

# Non-auto-fibers

Without this feature, Fiber switching is explicit (`Fiber.yield`) and most of case, it is easy to write several operations in atomic.

Typical atomic operation is increment. Let's think about it with example: `t = n; ...; n = t+1`.

```
def some_method
  Fiber.yield
end

n = 0
f1 = Fiber.new{
  t = n
  some_method
  n = t + 1
}

f1.resume
n += 1
f1.resume

p n #=> 1 (although two increments are tried)
```

In this case, main fiber and fiber f1 try to increment `n` and `some_method` breaks atomicity because of `Fiber.yield`. Of course, nobody write such silly code and it is easy to check because `Fiber.yield` is strongly coupled with Fiber operations written by users (basically, libraries don't call `Fiber.yield`).

# auto-fibers

However, auto-fiber switching introduce this kind of danger.

```
# assume all fibers are auto-scheduling fibers

n = 0
f1 = Fiber.new{
  n = log(t) + 1
}

f1.resume # auto-fibers should not call resume
          # but please allow me, this is pseudo-code to describe an issue.
n += 1
f1.resume

p n
```

If `log()` method tries to send a log message over network, Fiber will switch to other fibers.

Problems are:

* It is difficult to know which operations should be run in atomic (users write code without checking atomicity).
* It is difficult to find out which method can switch.
  * Not only user writing code, but also all library code can switch fibers.
  * This means that we need to check all of library code to know that they don't violate atomic assumptions.
* It introduced non-deterministic behavior (with `Fiber.yield` it will be deterministic behavior and it is easy to reproduce the problem).

This kind of difficulties are same as threading.  The impact can be smaller than threading (because threading can switch anywhere and it is very hard to predict the behavior. Auto-fibers switch only at blocking operations especially on IO operations).

# Consideration

To solve this behavior, we have several choice.

(1) Introduce synchronization mechanisms for auto-fibers

Like Mutex, Queue and so on.
On Ruby 1.8 era, we have `Thread.exclusive` to prohibit thread-switching.

I don't want to choice this option because it is what I want to avoid from Ruby.

(2) Introduce limitations

The problem "It is difficult to find out which method can switch" is because we need to check whole of code. If we can restrict the auto-fiber switching, this problem can be smaller.

(2-1) Introduce Fiber switching methods

Instead of implicit blocking (IO) operations, introduce explicit blocking operations can switch. We can check all of source code by grep.

(2-2) Check context

Permit fiber switching only at permitted places, by block, pragma, and so on.

```
# auto-fiber: true # <- this file can switch fibers automatically
Fiber.new(auto: true){
  ...
  io.read # can switch
  ...
  something_defined_in_gem # can't switch
  ...
}
```

I think other languages like Python, JavaScript employs this idea. I need to survey more on such languages.

(3) Something else cleaver

Introducing debugger is one choice (maybe it is easy than threading issues).
But we can't avoid troubles (and maybe the troubles should be not frequent, non-reproducible).

Other option is to introduce hooks to implement auto-fibers and provide auto-fibers by gems and advanced users know the above risk use this feature. But not good idea because we can't provide good way to write for many people.

thought?


----------------------------------------
Feature #13618: [PATCH] auto fiber schedule for rb_wait_for_single_fd and rb_waitpid
https://bugs.ruby-lang.org/issues/13618#change-65205

* Author: normalperson (Eric Wong)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
----------------------------------------
```
auto fiber schedule for rb_wait_for_single_fd and rb_waitpid

Implement automatic Fiber yield and resume when running
rb_wait_for_single_fd and rb_waitpid.

The Ruby API changes for Fiber are named after existing Thread
methods.

main Ruby API:

    Fiber#start -> enable auto-scheduling and run Fiber until it
		   automatically yields (due to EAGAIN/EWOULDBLOCK)

The following behave like their Thread counterparts:

    Fiber.start - Fiber.new + Fiber#start (prelude.rb)
    Fiber#join - run internal scheduler until Fiber is terminated
    Fiber#value - ditto
    Fiber#run - like Fiber#start (prelude.rb)

Right now, it takes over rb_wait_for_single_fd() and
rb_waitpid() function if the running Fiber is auto-enabled
(cont.c::rb_fiber_auto_sched_p)

Changes to existing functions are minimal.

New files (all new structs and relations should be documented):

    iom.h - internal API for the rest of RubyVM (incomplete?)
    iom_internal.h - internal header for iom_(select|epoll|kqueue).h
    iom_epoll.h - epoll-specific pieces
    iom_kqueue.h - kqueue-specific pieces
    iom_select.h - select-specific pieces
    iom_pingable_common.h - common code for iom_(epoll|kqueue).h
    iom_common.h - common footer for iom_(select|epoll|kqueue).h

Changes to existing data structures:

    rb_thread_t.afrunq   - list of fibers to auto-resume
    rb_vm_t.iom          - Ruby I/O Manager (rb_iom_t) :)

Besides rb_iom_t, all the new structs are stack-only and relies
extensively on ccan/list for branch-less, O(1) insert/delete.

As usual, understanding the data structures first should help
you understand the code.

Right now, I reuse some static functions in thread.c,
so thread.c includes iom_(select|epoll|kqueue).h

TODO:

    Hijack other blocking functions (IO.select, ...)

I am using "double" for timeout since it is more convenient for
arithmetic like parts of thread.c.   Most platforms have good FP,
I think.  Also, all "blocking" functions (rb_iom_wait*) will
have timeout support.

./configure gains a new --with-iom=(select|epoll|kqueue) switch

libkqueue:

  libkqueue support is incomplete; corner cases are not handled well:

    1) multiple fibers waiting on the same FD
    2) waiting for both read and write events on the same FD

  Bugfixes to libkqueue may be necessary to support all corner cases.
  Supporting these corner cases for native kqueue was challenging,
  even.  See comments on iom_kqueue.h and iom_epoll.h for
  nuances.

Limitations

Test script I used to download a file from my server:
----8<---
require 'net/http'
require 'uri'
require 'digest/sha1'
require 'fiber'

url = 'http://80x24.org/git-i-forgot-to-pack/objects/pack/pack-97b25a76c03b489d4cbbd85b12d0e1ad28717e55.idx'

uri = URI(url)
use_ssl = "https" == uri.scheme
fibs = 10.times.map do
  Fiber.start do
    cur = Fiber.current.object_id
    # XXX getaddrinfo() and connect() are blocking
    # XXX resolv/replace + connect_nonblock
    Net::HTTP.start(uri.host, uri.port, use_ssl: use_ssl) do |http|
      req = Net::HTTP::Get.new(uri)
      http.request(req) do |res|
    dig = Digest::SHA1.new
    res.read_body do |buf|
      dig.update(buf)
      #warn "#{cur} #{buf.bytesize}\n"
    end
    warn "#{cur} #{dig.hexdigest}\n"
      end
    end
    warn "done\n"
    :done
  end
end

warn "joining #{Time.now}\n"
fibs[-1].join(4)
warn "joined #{Time.now}\n"
all = fibs.dup

warn "1 joined, wait for the rest\n"
until fibs.empty?
  fibs.each(&:join)
  fibs.keep_if(&:alive?)
  warn fibs.inspect
end

p all.map(&:value)

Fiber.new do
  puts 'HI'
end.run.join
```


---Files--------------------------------
0001-auto-fiber-schedule-for-rb_wait_for_single_fd-and-rb.patch (82.8 KB)


-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request / ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>