Issue #15499 has been updated by dentarg (Patrik Ragnarsson).


After switching to Ruby 2.6.0 for our application on Heroku, we have been getting [R12 - Exit timeout](https://devcenter.heroku.com/articles/error-codes#r12-exit-timeout) errors after each deploy. The application is using Puma in cluster mode. While we haven't tried to reproduce this on Heroku, one of our tests starts our application just to see that it can boot, and we noticed that with 2.6.0, the TERM signal wasn't enough to stop the application. Our test now sends KILL after a while.

I've tried to do a minimal repro of the situation at https://github.com/dentarg/gists/tree/master/gists/ruby-bug-15499. I've sampled the hanged processes (that starts eating ~100% CPU) with Activity Monitor, see the txt files, maybe that can give some clue.

----------------------------------------
Bug #15499: Breaking behavior on ruby 2.6: rb_thread_call_without_gvl doesn't invoke unblock_function when used on the main thread 
https://bugs.ruby-lang.org/issues/15499#change-76116

* Author: apolcyn (alex polcyn)
* Status: Assigned
* Priority: Normal
* Assignee: ko1 (Koichi Sasada)
* Target version: 
* ruby -v: 2.6.0
* Backport: 2.4: DONTNEED, 2.5: DONTNEED, 2.6: REQUIRED
----------------------------------------
This issue was noticed when trying to add ruby 2.6 support to the "grpc" ruby gem (this gem is a native C-extension), and was caught by a unit test.

There are several APIs on the grpc ruby gem (https://github.com/grpc/grpc/tree/master/src/ruby) that invoke "rb_thread_call_without_gvl" on the current thread, doing a blocking operation in the "without gvl" callback and cancel that blocking operation in the "unblocking function". These APIs work in ruby versions prior to ruby 2.6 (e.g. ruby 2.5), but have problems when used on ruby 2.6

Minimal repro:

My system:

> lsb_release -a
No LSB modules are available.
Distributor ID:	Debian
Description:	Debian GNU/Linux 9.6 (stretch)
Release:	9.6
Codename:	stretch

> ruby -v
ruby 2.6.0p0 (2018-12-25 revision 66547) [x86_64-linux

# I installed ruby 2.6.0 with rvm - https://rvm.io/rvm/install

> GRPC_CONFIG=dbg gem install grpc --platform ruby # build grpc gem from source with debug symbols

ruby script, "repro.rb" that looks like this:

"""
require 'grpc'

ch = GRPC::Core::Channel.new('localhost:1234', {}, :this_channel_is_insecure)
ch.watch_connectivity_state(ch.connectivity_state, Time.now + 360)
"""

Run "ruby repro.rb" with an interactive shell, and it will hang there. At this point, ctrl^C the process, and it will not terminate.
What should happen is this unblocking func should be invoked: https://github.com/grpc/grpc/blob/master/src/ruby/ext/grpc/rb_channel.c#L354, but as seen with logging or debuggers, that unblocking func is never ran. Thus the blocking operation never completes and the main thread is stuck.

When the same repro.rb is ran on e.g. ruby 2.5.3 or ruby 2.4.1, the blocking operation is unblocked and the process terminates, as expected, when sending it a SIGINT.

Also note that if the blocking operation is put in a background thread, e.g. with this script:
"""
require 'grpc'

th = Thread.new do
  ch = GRPC::Core::Channel.new('localhost:1234', {}, :this_channel_is_insecure)
  ch.watch_connectivity_state(ch.connectivity_state, Time.now + 360)
end
th.join
"""

then "unblocking" functions will in fact be invoked upon sending the process a SIGINT, so this looks like a problem specifically with rb_thread_call_without_gvl being used on the main thread.

Please let me know and I can provide more details or alternative repro cases.

Thanks in advance.



-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request / ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>