Issue #13618 has been updated by ko1 (Koichi Sasada).


Sorry for my long absence on this topic. Summarizing it is a hard task (it is hard to start writing because of the difficulty of the problem and my English skill ;p ).

I'll try to write it up step by step.

----

# Discussion at the last developers' meeting

## Thread/Fiber switch safety

Koichi: (repeated my opinion about the difficulty of thread/fiber safety)

akr: providing better synchronization mechanisms (such as Go has) and encouraging safe parallel computation seems better.

Koichi: It is one possible solution, but my position is "if people can shoot themselves in the foot, they will".

Matz: I don't like forcing people to use locks and so on.

(The point is that Matz doesn't reject the "-safe" approach.)
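
To make akr's suggestion a little more concrete, here is a minimal sketch of message passing in today's Ruby, using the standard `Queue` class as a rough stand-in for a Go-style channel. This is only an illustration of the style of synchronization being discussed; it is not something that was proposed at the meeting, and it uses threads rather than auto-fibers.

```ruby
# A producer/consumer pair that shares no mutable state directly;
# all communication goes through a thread-safe queue.
queue = Queue.new

producer = Thread.new do
  3.times { |i| queue.push(i * 10) }
  queue.close # signal that no more items will arrive
end

consumer = Thread.new do
  results = []
  # Queue#pop returns nil once the queue is closed and empty.
  while (item = queue.pop)
    results << item
  end
  results
end

producer.join
received = consumer.value
p received
```

The consumer never touches the producer's state, so there is no lock for the programmer to forget.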

## Introduce restriction

(The following idea was not fully available at the last meeting; I showed only part of it.)

Koichi:
The problem with this feature is the gap in understanding between the auto-fiber user and the script writer. This is the same as thread-safety. Person A considers the code to be auto-fiber safe, and another person B (who can be the same as A) writes code without auto-fiber safety in mind; then there will be a problem.

In general, most existing libraries are not written as auto-fiber safe code (this doesn't mean most libraries are unsafe with auto-fiber. Much code is auto-fiber safe without any care).

If we can know that some code (and the code called by it) is auto-fiber safe, we can use auto-fiber safely.
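
As an illustration of the kind of problem meant here, the following sketch uses an explicit `Fiber.yield` as a stand-in for a context switch that auto-fiber could perform implicitly inside a blocking call. The `Account` class and its `withdraw` method are hypothetical and exist only for this example.

```ruby
# Hypothetical example: explicit Fiber.yield stands in for an automatic
# switch that auto-fiber might perform inside a blocking call.
class Account
  attr_reader :balance

  def initialize(balance)
    @balance = balance
  end

  # Written without auto-fiber safety in mind: a check followed by an
  # update, with a possible context switch in between.
  def withdraw(amount)
    return false if @balance < amount # check
    Fiber.yield                       # stand-in for an implicit switch (e.g. an I/O wait)
    @balance -= amount                # act, based on a possibly stale check
    true
  end
end

account = Account.new(100)
fibers = 2.times.map { Fiber.new { account.withdraw(100) } }

fibers.each(&:resume) # both fibers pass the check, then yield
fibers.each(&:resume) # both fibers perform the withdrawal

puts account.balance  # the balance is now negative
```

Each fiber on its own looks correct; the bug appears only because a switch can happen between the check and the update.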

There are three types of code.

* (1) don't care about auto-fiber
* (2) auto-fiber aware code (assume switching is not allowed at the beginning)
* (3) auto-fiber aware code (don't care it is allowed or not allowed to switch)

There are three types of status.

* (a) can't switch
* (b) can enable switching, but doesn't switch
* (c) can switch

As a matrix:

```
    can switch / can enable switch
(a) can't      / can't
(b) can't      / can
(c) can        / ??
```

Matrix of (1-3) against (a-c):

```
     (a)     (b)     (c)
(1)   OK      NG      NG
(2)   OK      OK      NG
(3)   OK(*1)  OK(*1)  OK
```

(1)-(b) and (1)-(c) are not acceptable because other methods called from this code can switch the context.
(2)-(c) is also unacceptable because the beginning of the code is not auto-fiber aware.

*1) Possible problem: (3) can introduce a deadlock problem because it can stop forever.
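
The matrix above can be written down as a small lookup table. This is only a sketch for reading the table; the symbol names are hypothetical labels chosen for this example, not proposed syntax or API.

```ruby
# Rows: code types (1)-(3). Columns: switch states (a)-(c).
# true means "OK" in the matrix above, false means "NG".
# All names here are hypothetical labels for illustration.
AUTO_FIBER_SAFE = {
  unaware:         { cannot_switch: true, can_enable: false, can_switch: false }, # (1)
  aware_no_switch: { cannot_switch: true, can_enable: true,  can_switch: false }, # (2)
  aware_any:       { cannot_switch: true, can_enable: true,  can_switch: true  }, # (3)
}.freeze

# Returns true when code of the given type may run in the given state.
# Note that (3) in states (a) and (b) is OK only with the deadlock caveat (*1).
def safe_to_run?(code_type, state)
  AUTO_FIBER_SAFE.fetch(code_type).fetch(state)
end

p safe_to_run?(:unaware, :can_enable)        # (1)-(b)
p safe_to_run?(:aware_no_switch, :can_enable) # (2)-(b)
```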

Normal threads start from (a).
Auto-fibers start from (b). They can be written as (1), (2), or (3). Probably (2) is what gets written for the auto-fiber top level. This code will call some async methods which can change the context.

My proposal is to write down (1) to (3) and (a) to (c) explicitly in the program.

At the meeting, I proposed immature keyword(-like) constructs to control them.
(And right now I don't have a good syntax for it yet.)

akira: If we introduce such keywords, we need to rewrite all of the code if we want to use an auto-fiber web application request handler (for example, we would need to rewrite Rails to run on an auto-fiber based Rack server).

Matz: it is unacceptable to require a huge rewrite of existing code.

(IMO (this did not come up at the last meeting) we need to rewrite all of the code to ensure auto-fiber safety even if we don't introduce keywords.)

# After this discussion

Matz and I discussed this issue, and we concluded that it is too early to introduce this feature in Ruby 2.5.

----

I want to consider this issue further. An auto-fiber based guild is one possibility; this means we could introduce object isolation and context switching together.


----------------------------------------
Feature #13618: [PATCH] auto fiber schedule for rb_wait_for_single_fd and rb_waitpid
https://bugs.ruby-lang.org/issues/13618#change-65757

* Author: normalperson (Eric Wong)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
----------------------------------------
```
auto fiber schedule for rb_wait_for_single_fd and rb_waitpid

Implement automatic Fiber yield and resume when running
rb_wait_for_single_fd and rb_waitpid.

The Ruby API changes for Fiber are named after existing Thread
methods.

main Ruby API:

    Fiber#start -> enable auto-scheduling and run Fiber until it
		   automatically yields (due to EAGAIN/EWOULDBLOCK)

The following behave like their Thread counterparts:

    Fiber.start - Fiber.new + Fiber#start (prelude.rb)
    Fiber#join - run internal scheduler until Fiber is terminated
    Fiber#value - ditto
    Fiber#run - like Fiber#start (prelude.rb)

Right now, it takes over the rb_wait_for_single_fd() and
rb_waitpid() functions if the running Fiber is auto-enabled
(cont.c::rb_fiber_auto_sched_p)

Changes to existing functions are minimal.

New files (all new structs and relations should be documented):

    iom.h - internal API for the rest of RubyVM (incomplete?)
    iom_internal.h - internal header for iom_(select|epoll|kqueue).h
    iom_epoll.h - epoll-specific pieces
    iom_kqueue.h - kqueue-specific pieces
    iom_select.h - select-specific pieces
    iom_pingable_common.h - common code for iom_(epoll|kqueue).h
    iom_common.h - common footer for iom_(select|epoll|kqueue).h

Changes to existing data structures:

    rb_thread_t.afrunq   - list of fibers to auto-resume
    rb_vm_t.iom          - Ruby I/O Manager (rb_iom_t) :)

Besides rb_iom_t, all the new structs are stack-only and rely
extensively on ccan/list for branch-less, O(1) insert/delete.

As usual, understanding the data structures first should help
you understand the code.

Right now, I reuse some static functions in thread.c,
so thread.c includes iom_(select|epoll|kqueue).h

TODO:

    Hijack other blocking functions (IO.select, ...)

I am using "double" for timeout since it is more convenient for
arithmetic like parts of thread.c.   Most platforms have good FP,
I think.  Also, all "blocking" functions (rb_iom_wait*) will
have timeout support.

./configure gains a new --with-iom=(select|epoll|kqueue) switch

libkqueue:

  libkqueue support is incomplete; corner cases are not handled well:

    1) multiple fibers waiting on the same FD
    2) waiting for both read and write events on the same FD

  Bugfixes to libkqueue may be necessary to support all corner cases.
  Supporting these corner cases for native kqueue was challenging,
  even.  See comments on iom_kqueue.h and iom_epoll.h for
  nuances.

Limitations

Test script I used to download a file from my server:
----8<---
require 'net/http'
require 'uri'
require 'digest/sha1'
require 'fiber'

url = 'http://80x24.org/git-i-forgot-to-pack/objects/pack/pack-97b25a76c03b489d4cbbd85b12d0e1ad28717e55.idx'

uri = URI(url)
use_ssl = "https" == uri.scheme
fibs = 10.times.map do
  Fiber.start do
    cur = Fiber.current.object_id
    # XXX getaddrinfo() and connect() are blocking
    # XXX resolv/replace + connect_nonblock
    Net::HTTP.start(uri.host, uri.port, use_ssl: use_ssl) do |http|
      req = Net::HTTP::Get.new(uri)
      http.request(req) do |res|
        dig = Digest::SHA1.new
        res.read_body do |buf|
          dig.update(buf)
          #warn "#{cur} #{buf.bytesize}\n"
        end
        warn "#{cur} #{dig.hexdigest}\n"
      end
    end
    warn "done\n"
    :done
  end
end

warn "joining #{Time.now}\n"
fibs[-1].join(4)
warn "joined #{Time.now}\n"
all = fibs.dup

warn "1 joined, wait for the rest\n"
until fibs.empty?
  fibs.each(&:join)
  fibs.keep_if(&:alive?)
  warn fibs.inspect
end

p all.map(&:value)

Fiber.new do
  puts 'HI'
end.run.join
```


---Files--------------------------------
0001-auto-fiber-schedule-for-rb_wait_for_single_fd-and-rb.patch (82.8 KB)

