Issue #17220 has been updated by naruse (Yui NARUSE).


As glass_saga says,
* Before fork, all pthreads need to be stopped (as far as I understand, this is regarded as a practical restriction on Unix).
* getaddrinfo_a uses its own pthreads to provide its async feature. It has a queue and a pool of pthreads (up to 20 threads) to handle DNS requests. getaddrinfo_a with GAI_NOWAIT posts a task to the queue.
* gai_cancel(3) can remove a queued task, but it doesn't remove or stop a task that is already running.
* gai_error(3) returns the status of a task.
* To stop those pthreads, we need to stop posting new tasks, remove all queued tasks, and then stop or wait for the running tasks. After that, the worker threads finish after a 1-second sleep. To skip that sleep, we would need to call an internal glibc API, __gai_new_request_notification().
* To stop posting new tasks, we just need to hold the GVL.
* To remove all queued tasks, we can use gai_cancel. Although the man page mentions gai_cancel(NULL), in practice every request has to be passed to it one by one.
* There's no way to stop running tasks with glibc's getaddrinfo_a. To stop them, we need to re-implement getaddrinfo_a. With our own implementation, we can use pthread_cancel(3) to stop getaddrinfo(3) in the threads. Some issues have been reported with this approach, but it can in fact be done. (ref. https://bugzilla.redhat.com/show_bug.cgi?id=1209433)
* In Ruby 3.0 we'll make sure the pthreads are stopped before fork. In the future, we'll provide further enhancements.
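
The "remove queued tasks, then wait for running tasks" steps above can be sketched in C roughly as follows. This is only an illustration under assumptions, not Ruby's actual implementation: `submit_lookup` and `drain_lookups` are hypothetical helper names, and the caller is assumed to already prevent new requests from being posted (for example by holding the GVL). On older glibc versions the program may also need to be linked with `-lanl`.

```c
#define _GNU_SOURCE          /* getaddrinfo_a and friends are GNU extensions */
#include <assert.h>
#include <netdb.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: queue one asynchronous lookup without blocking.
 * host and hints must stay valid until the request has finished. */
static int submit_lookup(struct gaicb *req, const char *host,
                         const struct addrinfo *hints)
{
    struct gaicb *list[1] = { req };

    memset(req, 0, sizeof(*req));
    req->ar_name = host;
    req->ar_request = hints;
    return getaddrinfo_a(GAI_NOWAIT, list, 1, NULL);
}

/* Hypothetical helper: make sure none of the n requests is queued or
 * running any more. Returns how many requests were already running and
 * therefore had to be waited for instead of cancelled. */
static int drain_lookups(struct gaicb *reqs[], int n)
{
    int waited = 0;

    assert(reqs != NULL && n >= 0);
    for (int i = 0; i < n; i++) {
        /* gai_cancel removes a request only while it is still queued. */
        if (gai_cancel(reqs[i]) != EAI_NOTCANCELED)
            continue;            /* cancelled, or already finished */

        /* Already running: it cannot be stopped, so wait for it. */
        const struct gaicb *wait_list[1] = { reqs[i] };
        while (gai_error(reqs[i]) == EAI_INPROGRESS)
            gai_suspend(wait_list, 1, NULL);
        waited++;
    }
    return waited;
}
```

After `drain_lookups` returns, `gai_error` no longer reports `EAI_INPROGRESS` for any of the requests. A request whose `gai_error` is 0 owns a result list in `ar_result` that should be released with `freeaddrinfo`. The sketch does not address the idle worker threads themselves, which (as noted above) only exit after their sleep.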

----------------------------------------
Bug #17220: Rails Active Job integration test fails with Ruby 3.0.0 since 2038cc6cab6ceeffef3ec3a765c70ae684f829ed
https://bugs.ruby-lang.org/issues/17220#change-88742

* Author: yahonda (Yasuo Honda)
* Status: Assigned
* Priority: Normal
* Assignee: Glass_saga (Masaki Matsushita)
* Target version: 3.0
* ruby -v: ruby 2.8.0dev (2020-08-27T07:39:13Z v3_0_0_preview1~397 2038cc6cab) [x86_64-linux]
* Backport: 2.5: DONTNEED, 2.6: DONTNEED, 2.7: DONTNEED
----------------------------------------
One of the Rails CI jobs, the Active Job integration test with sidekiq, has been failing against Ruby 3.0.0 since August 30, 2020.

According to `git bisect`, it is triggered by 2038cc6cab6ceeffef3ec3a765c70ae684f829ed. Somehow this issue only reproduces with Ruby on Docker, such as `rubylang/ruby:master-nightly-bionic`.
It does not reproduce if Ruby is installed locally using `rbenv install 3.0.0-dev` on Ubuntu 20.04 or macOS 11 beta.

### The first failed build job

https://buildkite.com/rails/rails/builds/71321#84b29655-b3df-4b5c-8b20-cbf15ecd9653

```ruby
Ruby          2.8.0p-1 (2020-08-29 revision d7492a0be885ea9f2b9f71e3e95582f9a859c439) [x86_64-linux]
```

### The last successful build job
https://buildkite.com/rails/rails/builds/71143#369217f7-95f6-4ab9-8ef5-7c6364bd803e

```ruby
Ruby          2.8.0p-1 (2020-08-20 revision a74df67244199d1fd1f7a20b49dd5a096d2a13a2) [x86_64-linux]
```

### `git bisect` result

I ran `git bisect` on the ruby/ruby repository, and it reports that `2038cc6cab6ceeffef3ec3a765c70ae684f829ed` triggers this build failure.

## Steps to reproduce

1. Install Docker
2. Install Ruby 2.7.1 (or any other Ruby version that can run rake)
3. Create a Ruby Docker image for 2038cc6cab6ceeffef3ec3a765c70ae684f829ed

```
git clone https://github.com/ruby/ruby-docker-images.git
cd ruby-docker-images
rake docker:build ruby_version=master:2038cc6cab6ceeffef3ec3a765c70ae684f829ed
```

4. Run Rails CI using the Docker image created in step 3

```
cd ~
git clone https://github.com/rails/rails.git
cd rails
git clone https://github.com/rails/buildkite-config .buildkite/
RUBY_IMAGE=rubylang/ruby:master-2038cc6cab6ceeffef3ec3a765c70ae684f829ed-bionic docker-compose -f .buildkite/docker-compose.yml build base && CI=1 docker-compose -f .buildkite/docker-compose.yml run default runner activejob 'AJ_ADAPTER=sidekiq AJ_INTEGRATION_TESTS=true bin/test test/integration/queuing_test.rb --seed 5170'
```

## Actual result

```
Using sidekiq
Run options: --seed 5170

# Running:

.SSSF

Failure:
QueuingTest#test_should_run_job_enqueued_in_the_future_at_the_specified_time [/rails/activejob/test/integration/queuing_test.rb:76]:
Expected false to be truthy.


bin/test test/integration/queuing_test.rb:71

.F

Failure:
QueuingTest#test_should_run_jobs_enqueued_on_a_listening_queue [/rails/activejob/test/integration/queuing_test.rb:14]:
Expected false to be truthy.


bin/test test/integration/queuing_test.rb:11

.SS..F

Failure:
QueuingTest#test_current_locale_is_kept_while_running_perform_later [/rails/activejob/test/integration/queuing_test.rb:102]:
Expected false to be truthy.


bin/test test/integration/queuing_test.rb:93

F

Failure:
QueuingTest#test_current_timezone_is_kept_while_running_perform_later [/rails/activejob/test/integration/queuing_test.rb:119]:
Expected false to be truthy.


bin/test test/integration/queuing_test.rb:110

..

Finished in 34.153644s, 0.4392 runs/s, 0.3514 assertions/s.
15 runs, 12 assertions, 4 failures, 0 errors, 5 skips

You have skipped tests. Run with --verbose for details.
```


## Expected result

It should succeed, as it does with Ruby built from the previous commit `1035a3b202ee86bf2b0a1d00eefcfff0d7ab9f6b`.

```
$ RUBY_IMAGE=rubylang/ruby:master-1035a3b202ee86bf2b0a1d00eefcfff0d7ab9f6b-bionic docker-compose -f .buildkite/docker-compose.yml build base && CI=1 docker-compose -f .buildkite/docker-compose.yml run default runner activejob 'AJ_ADAPTER=sidekiq AJ_INTEGRATION_TESTS=true bin/test test/integration/queuing_test.rb --seed 5170'
```

```
+++
+++ activejob: AJ_ADAPTER=sidekiq AJ_INTEGRATION_TESTS=true bin/test test/integration/queuing_test.rb --seed 5170
Using sidekiq
Run options: --seed 5170

# Running:

.SSS....SS.....

Finished in 13.647623s, 1.0991 runs/s, 1.0258 assertions/s.
15 runs, 14 assertions, 0 failures, 0 errors, 5 skips

You have skipped tests. Run with --verbose for details.
```




-- 
https://bugs.ruby-lang.org/
