takashikkbn / gmail.com wrote:
> @normalperson By the way, is there any plan to apply
> `rb_f_system`-like changes to `rb_f_spawn` as well? Many of
> "1. waitpid" deadlocks seem to come from a process created by
> `rb_f_spawn`, and also I guess those waitpid-related race
> conditions may also result in "2. in ruby_cleanup" deadlocks
> as well. So it would be worth taking a look since we may be
> releasing 2.6.0 preview3 shortly.

I'm not sure that's the problem, actually; and holding
vm->waitpid_lock across two Ruby method calls won't work.

FWIW, the missing locks ([ruby-core:89629]) leading to data
corruption could be causing some of these timeouts/pauses.

> I briefly took a look, but at least we can't use `alloca` for
> `waitpid_state` on `rb_f_spawn` and so the code for it would
> be slightly different from `rb_f_system`'s one. I'll leave it
> to you since you're more familiar with the current
> implementation (and possible race conditions) around waitpid.

AFAIK, the waitpid code had been fine for a while until you
started using postponed jobs, right?  I was mostly away for
a few weeks, dealing with several computer hardware problems.

Since the waitpid problems seem new, that leads me to more
strongly suspect MJIT is clobbering some memory which the
waitpid/locking code relies on.
