Issue #13794 has been updated by catphish (Charlie Smurthwaite).


>> How about checking owner_process before incrementing?

> I'm afraid this fix doesn't quite match up in my mind. To clarify, I am suggesting that timer_thread_pipe.writing is being incremented in the parent process before the fork occurs. This would still occur because the PID would match at that point.


Checking it before the while loop might work though.

----------------------------------------
Bug #13794: Infinite loop of sched_yield
https://bugs.ruby-lang.org/issues/13794#change-66121

* Author: catphish (Charlie Smurthwaite)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: ruby 2.3.4p301 (2017-03-30 revision 58214) [x86_64-linux]
* Backport: 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: UNKNOWN
----------------------------------------
I have been encountering an issue with processes hanging in an infinite loop of calling sched_yield(). The looping code can be found at https://github.com/ruby/ruby/blob/v2_3_4/thread_pthread.c#L1663

while (ATOMIC_CAS(timer_thread_pipe.writing, (rb_atomic_t)0, 0)) {
  native_thread_yield();
}

It is my belief that by some mechanism I have not been able to identify, timer_thread_pipe.writing is incremented but it never decremented, causing this loop to run infinitely.

I am not able to create a reproducible test case, however this issue occurs regularly in my production application. I have attached backtraces and thread lists from 2 processes exhibiting this behaviour. gdb confirms that timer_thread_pipe.writing = 1 in these processes.

I believe one possibility of the cause is that rb_thread_wakeup_timer_thread() or rb_thread_wakeup_timer_thread_low() is called, and before it returns, another thread calls fork(), leaving the value of timer_thread_pipe.writing incremented, but leaving behind the thread that would normally decrement it.

If this is correct, one solution would be to reset timer_thread_pipe.writing to 0 in native_reset_timer_thread() immediately after a fork.

Other examples of similar bugs being reported:
https://github.com/resque/resque/issues/578
https://github.com/zk-ruby/zk/issues/50

---Files--------------------------------
backtrace_1.txt (14 KB)
backtrace_2.txt (10.9 KB)


-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request / ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>