Issue #17325 has been updated by nevans (Nicholas Evans).


Thanks for taking a look at this, Benoit. I agree it's not obvious why this is necessary with Fiber#raise, so I'll try to explain my reasoning in more detail:

Yes, a library (e.g. `async`) could write a "suspend" function that wraps resume, yield, and transfer. That library could implement cancel _(and raise)_ by tunneling resume types & values through a struct. That library can't directly control fibers that are created outside its wrapper (e.g. `Enumerator`), but our "suspend" wrapper could implement a "trampoline" fiber to interoperate with most simple cases like `Enumerable`. I know all of this is possible, because I've been using a suspend wrapper like this on one of my work projects for a few years now! :) Still, it's not hard to imagine incompatible fiber-creating libraries that circumvent our wrapper in unpredictable ways. Putting these into `Fiber` gives all fiber-using applications and libraries a shared baseline they can rely upon.
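To make the "tunneling through a struct" idea concrete, here is a minimal sketch of such a wrapper. It is not the wrapper from my work project and not `async`'s API: `Suspendable`, `Envelope`, `spawn`, `suspend`, and `cancel` are all hypothetical names, and it assumes every suspension inside the fiber goes through `Suspendable.suspend`.

```ruby
require "fiber" # only needed for Fiber#alive? on older Rubies

module Suspendable
  # Hypothetical envelope that tunnels a "resume type" alongside its value.
  Envelope = Struct.new(:type, :value)

  # Anonymous tag, so code running inside the fiber cannot catch it by name.
  CANCEL_TAG = Object.new

  # Create a fiber whose body can be unwound by a tunneled cancel.
  def self.spawn(&block)
    Fiber.new { catch(CANCEL_TAG) { block.call } }
  end

  # Wraps Fiber.yield and interprets whatever the fiber is resumed with.
  def self.suspend(*values)
    resumed_with = Fiber.yield(*values)
    return resumed_with unless resumed_with.is_a?(Envelope)

    case resumed_with.type
    when :cancel then throw CANCEL_TAG, resumed_with.value # skips rescue, runs ensure
    when :raise  then raise resumed_with.value
    else resumed_with.value
    end
  end

  # Only works while the fiber is suspended inside Suspendable.suspend.
  def self.cancel(fiber, reason = nil)
    fiber.resume(Envelope.new(:cancel, reason))
  end
end

log = []
task = Suspendable.spawn do
  begin
    Suspendable.suspend(:ready)
    log << :not_reached
  rescue Exception
    log << :rescued
  ensure
    log << :cleaned_up
  end
end

first  = task.resume                        # runs until the first suspend
result = Suspendable.cancel(task, :closing) # unwinds through ensure only
```

The limitation is visible in the sketch itself: `cancel` can only reach a fiber that is parked in this wrapper's `suspend`. A fiber suspended by a bare `Fiber.yield`, or one that is currently resuming some other fiber, is out of reach.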

I disagree on "not a common operation". I think that relatively short-lived fibers could become as commonplace as relatively short-lived goroutines are in Go (and I hope they do). And I think non-exceptional cancellation is a very important concept for asynchronous code to handle explicitly. Other languages with coroutines special-case "cancel" (even if some of them tunnel cancellation via error types). E.g. Go's `Context` type uses `Done()`, and the `go vet` tool checks that `CancelFunc`s are used on all control-flow paths. Kotlin coroutines' `Job` class has cancel methods but *not* raise methods.

### Propagation

The most important difference from `Fiber#raise` isn't that it is unrescuable or uses fewer resources, but how propagation works:

 * You can cancel *any* living fiber except the root-fiber.
   * I've disallowed canceling the root to avoid accidentally shutting the thread down.
   * I'm currently raising `FiberError` for terminated fibers, but I think I'd like to change that. I think canceling dead fibers should simply return `nil` or `false`. That way we can safely call cancel on any fiber, and the only reason it would raise an exception is if the child fiber terminates with an exception during cancellation.
 * Cancel propagates _down_ through the linked list of resumed fibers.
 * Execution of `ensure` still runs from bottom up.
 * It does _not_ propagate an error or cancel _up_ from the canceled fiber.

E.g. if an async task fiber is resuming into a lazy enumerator which is resuming into another fiber (and so on), and your thread scheduler wants to cancel your task's fiber but doesn't know about those other fibers, `Fiber#raise` won't work. If your fiber is transferring, `Fiber#raise` won't work. `Fiber#cancel` will work on any living fiber while still following the rules (#17221) for transferring control.
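As a small illustration of how easily such unknown fibers appear: in CRuby, external enumeration with `Enumerator#next` runs the generator block on a fiber of its own, which the calling code never gets a handle to. The sketch below only shows that this hidden fiber exists; the variable names are mine.

```ruby
require "fiber" # Fiber.current needs this before Ruby 3.1

caller_fiber = Fiber.current

enum = Enumerator.new do |y|
  y << Fiber.current # report which fiber the generator block runs on
end

# Enumerator#next drives the generator from a separate, internal fiber.
generator_fiber = enum.next
hidden = !generator_fiber.equal?(caller_fiber)
```

Here `hidden` is `true`: the task that called `enum.next` was resuming into a fiber that no scheduler created or registered.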

Just as `break` doesn't generally need to know or care about the implementation of intervening library code, canceling a fiber shouldn't need to know or care about the implementation of any sub-fibers that may have been resumed.

### Semantics of raise/rescue vs unexceptional break/return

I'm not against temporarily or explicitly blocking cancellation. Avoiding swallowed exceptions is not the most important feature, but it's still a useful one I think. And any small performance gain would be desirable, but not the primary driver.  Aside from either of those, `raise` has different semantics from `break` or `return` (or `throw`). (As currently written) this is only for semantically *unexceptional* flow-control.  And this isn't a matter of "application code can handle it" because applications can't control their intervening library code, nor can they control fibers created by intervening library code.  It's quite common to see code like the following:

```ruby
def foo
  some_library_code_runs_this_block_many_stack_layers_deep do
    result = etc_etc_etc
    return result.retval if result.finished?
  end
end
```

We *could* wrap this in an exception handler, but it would be more confusing to the casual reader than simply using `return` or `break` (or maybe `catch` and `throw`).  For jumping to the `ensure` block of a particular method we use `return`.  For block scope, `break`.  For fiber-scope: `Fiber#cancel`.

I expect that `return` statement to non-exceptionally rewind the stack without being caught by any `catch` or `rescue`.  I don't want a library's hidden `rescue Exception` to subvert my `break` or `return` (libraries shouldn't do this, but sometimes they do).
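A quick way to see this behavior for yourself, with a stand-in for the badly behaved library (`library_call` and `find_first_even` are made-up names for this sketch):

```ruby
# Stand-in for library code with an over-eager rescue.
def library_call(log)
  yield
rescue Exception => e
  log << [:rescued, e.class]
ensure
  log << :ensure
end

def find_first_even(numbers, log)
  library_call(log) do
    # Non-local return: unwinds through library_call without raising.
    numbers.each { |n| return n if n.even? }
  end
  nil
end

log = []
value = find_first_even([1, 3, 4, 5], log)
```

After this runs, `value` is `4` and `log` is `[:ensure]`: the library's `rescue Exception` never saw the `return`, but its `ensure` still ran.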

It's not as simple as an "application problem". Task cancellation could be triggered by application code or by library code. A task-scheduler library might call `Fiber#cancel`, and the fibers being canceled might be in application code or in library code, or might be suspended by resuming into fibers that are uncontrolled by or even unknown to the task scheduler. None of that should matter.

Wrapping a task with `catch {|tag| ... }` would be conceptually better than exception handling... but `throw tag` from an `Enumerator` won't propagate up to the return fiber.  (I don't want to change this behavior.)

```
ruby -e 'f = Fiber.new { throw :foo }; p((catch(:foo) { f.resume } rescue $!))'
#<UncaughtThrowError: uncaught throw :foo>
```

### Examples

To be clear, these are toy examples and I'd want most of the following to be handled for me by a fiber-task-scheduler library (like `async`). But that library itself should have a mechanism for canceling resuming tasks, even when it doesn't (or can't) know about the resumed child fibers of those tasks. `Fiber#raise` (as currently written) can't do that.

```ruby
def run_server
  server = MyFiberyTCPServer.open
  # Do stuff. e.g. accept clients, assign connections to fibers, etc.
  # Those connections can create their own sub-fibers.
  # The server may know nothing about those sub-fibers. It shouldn't need to.
  # Those subfibers might even use an entirely different scheduler. That's okay.
  # Connection fibers might be un-resumable because they are resuming. No prob.
  wait_for_shutdown_signal # => transfers to some sort of fiber scheduler
ensure
  # cancels all connection-handler fibers
  server.connections.each do |c|
    # Are those connection fibers resuming other sub-fiber tasks?
    # Do we even know about those sub-tasks?
    # Can we even know about them from here?
    # Who cares? Those need to be canceled too!
    c.cancel :closing if c.alive?
    # I'd like to make dead_fiber.cancel unexceptional too
  end
end

# fetching a resource may depend on fetching several other resources first
def resource_client
  a = schedule_future { http.get("/a") }
  b = schedule_future { http.get("/b") }
  items = a.value.item_ids.map {|id| schedule_future { http.get("/items/#{id}") } }
  combine_results(b, items)
ensure
  # if any of the above raises an exception
  # or if *this* fiber is canceled
  # or if combine_results completed successfully before all subtasks complete
  a&.cancel rescue nil # is it resuming another fiber? don't know, don't care.
  b&.cancel rescue nil # is it resuming another fiber? don't know, don't care.
  items&.each do |item| item.cancel rescue nil end # ditto
end

# yes, task library code would normally provide a better pattern for this
def with_timeout(seconds)
  task = nil # declared first so the timer block below closes over it
  timer = Task.schedule do
    sleep seconds
  ensure
    task.cancel :timeout
  end
  task = Task.schedule do
    yield # does this resume into sub-tasks? we shouldn't need to know.
  ensure
    timer.cancel
  end
  task.value
end
```

### No guarantees

And yes, we can always have misbehaving code and I'm not trying to guarantee against every case. We can't guard against certain categories of bugs nor infinite loops. It's always possible someone's written:

```ruby
def foo
  bar
ensure
  while true
    begin
      while true
        Fiber.yield :misbehaving
      end
    rescue Exception
      # evil code being evil
    end
  end
end
```

But that's entirely outside the scope of this. :)

We can have bugs here just like any code can have bugs. But in my experience, `ensure` code is usually *much* shorter and simpler than other code.  Shut down, clean up, release, and reset.

----------------------------------------
Feature #17325: Adds Fiber#cancel, which forces a Fiber to break/return
https://bugs.ruby-lang.org/issues/17325#change-88553

* Author: nevans (Nicholas Evans)
* Status: Open
* Priority: Normal
----------------------------------------

Calling `Fiber#cancel` will force a fiber to return, skipping rescue and catch blocks but running all ensure blocks. It behaves as if a `break` or `return` were used to jump from the last suspension point to the top frame of the fiber. Control will be transferred to the canceled fiber so it can run its ensure blocks.

## Propagation from resuming to resumed fibers

Any non-root living fiber can be canceled and cancellation will propagate to child (resumed) fibers. In this way, a suspended task can be canceled even if it is e.g. resuming into an enumerator, and the enumerator will be canceled as well. Transfer of control should match #17221's *(much improved)* transfer/resume semantics. After the cancellation propagates all the way to the bottom of the fiber resume stack, the last fiber in the chain will then be resumed. Resuming fibers will not run until they are yielded back into.

## Suspension of canceled fibers

Canceled fibers can still transfer control with `resume`, `yield`, and `transfer`, which may be necessary in order to release resources from `ensure` blocks. For simplicity, subsequent cancels will behave similarly to calling `break` or `return` inside an `ensure` block, and the last cancellation reason will overwrite earlier reasons.
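For readers less familiar with the analogy, this is the existing Ruby behavior being referred to: a `return` inside `ensure` replaces whatever unwinding was already in progress. The method names below are only for illustration.

```ruby
def last_reason_wins
  return :first_reason
ensure
  # A second, later "reason" to leave the method; it replaces the first.
  return :second_reason
end

def swallows_exception
  raise "boom"
ensure
  # Returning from ensure also discards the in-flight exception.
  return :overridden
end

overwritten = last_reason_wins
swallowed   = swallows_exception
```

`overwritten` is `:second_reason` and `swallowed` is `:overridden`. A second cancel arriving while a fiber is already running its `ensure` blocks is intended to behave in the same spirit.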

## Alternatives

`Fiber#raise` could be used, but:
* Exceptions are bigger and slower than `break`.
* `#raise` can't (and shouldn't) be sent to resuming fibers. (It can't propagate.)
* Exceptions can be caught. This might be desirable, but that should be at the discretion of the calling fiber.

Catch/Throw could be used (with an anonymous `Object.new`), but:
* `catch` adds an extra stack frame.
* It would need to add `Fiber#throw` (or wrap/intercept `Fiber.yield`).
* A hypothetical `Fiber#throw` should probably only be allowed on yielding fibers (like `Fiber#resume`). (It wouldn't propagate.)

Implementation:  https://github.com/ruby/ruby/pull/3766


