Issue #5446 has been updated by jeremyevans0 (Jeremy Evans).


tenderlovemaking (Aaron Patterson) wrote:
> I think library authors can make things easier though.  Web frameworks, like Rails for example, are expected to handle this situation for the user.  In addition, say a library author provided no such feature like Sequel, how would a user know they need to call `DB.disconnect` after a fork?  Are they responsible for completely understanding the implementation of the library they are using?  Even if an end user called `DB.disconnect` in an after fork hook, what if that wasn't enough?  How would an end user know what needs to be called?

Preloading before forking is not the default in unicorn or puma, and both projects' documentation mentions the related issues where it describes how to enable preloading.  Additionally, Sequel's documentation mentions the need to disconnect when preloading before forking.  Users should not be expected to understand the implementation of the libraries they use, but they should be expected to read the documentation for non-default configuration options they explicitly enable.

Now, passenger enables preloading before forking by default.  Some people may consider that an issue with passenger, while others may be OK with it optimizing for the case where only ActiveRecord is used, without any other libraries that allocate file descriptors.  To be fair, passenger's documentation does discuss the issue in detail.

FWIW, disconnecting (at least with Sequel) should be done before forking, not after forking.  After forking you are already sharing the file descriptors.  As long as the library can reestablish connections as needed (as Sequel can), a single line of code to disconnect before fork is globally much simpler than extensive code that tries (and fails) to handle all possible cases when fork can be used.
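
To make the ordering concrete, here is a minimal sketch of the disconnect-before-fork pattern.  `LazyConnection` and `disconnect_before_fork_demo` are hypothetical names invented for this example; the class only stands in for a pool that can reestablish connections on demand and is not Sequel's implementation.

```ruby
require "tmpdir"

# Hypothetical stand-in for a connection pool; this is not Sequel's code.
class LazyConnection
  def initialize(path)
    @path = path
    @io = nil
  end

  # Opens the descriptor on first use, and again after a disconnect.
  def io
    @io ||= File.open(@path, "a")
  end

  def connected?
    !@io.nil?
  end

  # Closes the descriptor so that a later fork has nothing to share.
  def disconnect
    @io&.close
    @io = nil
  end
end

# Returns [parent_connected_after_fork, lines_written_to_the_file].
def disconnect_before_fork_demo
  path = File.join(Dir.tmpdir, "lazy-connection-#{Process.pid}.log")
  conn = LazyConnection.new(path)
  conn.io.puts("parent: preload work")

  conn.disconnect # runs in the parent, before fork

  pid = fork do
    # The child reconnects lazily and gets a descriptor of its own.
    conn.io.puts("child: opened its own descriptor")
    conn.disconnect
    exit!(0) # skip at_exit handlers inherited from the parent
  end
  Process.wait(pid)

  [conn.connected?, File.readlines(path, chomp: true)]
ensure
  File.delete(path) if path && File.exist?(path)
end
```

Calling `disconnect_before_fork_demo` returns `[false, ["parent: preload work", "child: opened its own descriptor"]]`: the parent holds no descriptor at the time of the fork, and the child opens its own on first use.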

I will posit that any attempt to disconnect automatically during forking (either in the parent or child) can break a reasonable (if less common) setup where forking is used correctly without explicit disconnection.

> >  How about an alternate proposal?
> >  
> >  	Introduce a new object_id-like identifier which changes
> >  	across fork: Thread.current.thread_id
> >  
> >  It doesn't penalize platforms without fork, and can work well
> >  with existing thread-aware code.
> 
> I think this is a good idea, but I'm not sure it addresses the communication issue I brought up.  IMO it would be great to have some sort of hook so that library authors can dictate what "the right thing to do" is after a fork (maybe there are other resources or caches that need to be cleaned, and maybe that changes from version to version).

At least for Sequel, I don't think this will matter.  I'm neither for nor against this.
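
For illustration, a library can already approximate an identifier that changes across fork by remembering `Process.pid`, which differs in the child.  The sketch below is only one way to do that; `ForkAwareCache` and `child_sees_stale_entry?` are hypothetical names, not part of Ruby or of any library mentioned here.

```ruby
# Hypothetical helper: drops its contents the first time it is used in a
# process other than the one that created (or last reset) it.
class ForkAwareCache
  def initialize
    @owner_pid = Process.pid
    @store = {}
  end

  def [](key)
    reset_if_forked
    @store[key]
  end

  def []=(key, value)
    reset_if_forked
    @store[key] = value
  end

  private

  # Process.pid differs in the child, so a mismatch means a fork happened.
  def reset_if_forked
    return if @owner_pid == Process.pid

    @owner_pid = Process.pid
    @store = {}
  end
end

# Forks and reports whether the child can still see the parent's :handle entry.
def child_sees_stale_entry?(cache)
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    writer.write(cache[:handle].nil? ? "fresh" : "stale")
    writer.close
    exit!(0) # skip at_exit handlers inherited from the parent
  end
  writer.close
  result = reader.read
  reader.close
  Process.wait(pid)
  result == "stale"
end
```

With `cache = ForkAwareCache.new` and `cache[:handle] = "parent-only"`, `child_sees_stale_entry?(cache)` returns `false`, while the parent still sees its own entry afterwards.  Unlike a hook, this check costs a pid comparison on every access.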

> Additionally, forking servers all have to provide this type of hook anyway (Unicorn, Resque, Puma, to name a few) but today they have to specify their own API.  I think it would be great if we had a "Rack for fork hooks", if that makes sense.  :)

I think the main problem here is that libraries may offer `before_fork` or `after_fork` hooks that are called with arguments that an in-core `at_fork` hook couldn't use.  I know that is the case with unicorn.

Additionally, there is the issue with in-core `at_fork` hooks in terms of inheritance:

~~~ ruby
  fork do # calls at_fork hooks
    fork do # calls at_fork hooks?
    end
  end
~~~

There are arguments for inheriting the hooks, and there are arguments for not doing so.  I suppose you could make it configurable, but that increases complexity.  When fork hooks are implemented in a library, they apply to only the case where the library forks, and not all cases where `fork` is called.

Consider that you are using a preforking webserver, and in your parent process, you do:

~~~ ruby
r, w = IO.pipe
if fork
  # ...
else
  # ...
  exit!
end
~~~

If an in-core `at_fork` handler is used, the webserver's forking hooks may be called for this fork, when that may not be desired.  That is not the case for library-specific fork hooks that wrap the library's usage of `fork`.
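
A rough sketch of what library-specific hooks look like, under the assumption of a hypothetical `TinyServer` class (the names and signatures are illustrative and are not unicorn's or puma's API).  The hooks receive library-specific arguments and run only for forks made through the library's own method:

```ruby
# Hypothetical preforking server used only to illustrate hook scoping.
class TinyServer
  def initialize
    @before_fork = []
    @after_fork = []
  end

  def before_fork(&block)
    @before_fork << block
  end

  def after_fork(&block)
    @after_fork << block
  end

  # Only forks made through this method run the hooks.  The hooks receive
  # arguments (server, worker id) that an in-core hook would not know about.
  def spawn_worker(worker_id)
    @before_fork.each { |hook| hook.call(self, worker_id) }
    fork do
      @after_fork.each { |hook| hook.call(self, worker_id) }
      yield worker_id
      exit!(0) # skip at_exit handlers inherited from the parent
    end
  end
end

# Returns the worker ids for which the before_fork hook ran in the parent.
def hook_scope_demo
  calls = []
  server = TinyServer.new
  server.before_fork { |_server, worker_id| calls << worker_id }

  Process.wait(server.spawn_worker(1) { |_id| nil }) # hooks run
  Process.wait(fork { exit!(0) })                    # unrelated fork: no hooks
  calls
end
```

`hook_scope_demo` returns `[1]`: the plain `fork` in the parent never touches the server's hooks, which is the behavior an in-core `at_fork` handler could not guarantee.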

----------------------------------------
Feature #5446: at_fork callback API
https://bugs.ruby-lang.org/issues/5446#change-73088

* Author: normalperson (Eric Wong)
* Status: Assigned
* Priority: Normal
* Assignee: kosaki (Motohiro KOSAKI)
* Target version: 
----------------------------------------
It would be good if Ruby provides an API for registering fork() handlers.

This allows libraries to automatically and agnostically reinitialize resources
such as open IO objects in child processes whenever fork() is called by a user
application.  Use of this API by library authors will reduce false/improper
sharing of objects across processes when interacting with other
libraries/applications that may fork.

This Ruby API should function similarly to pthread_atfork() which allows
(at least) three different callbacks to be registered:

1) prepare - called before fork() in the original process
2) parent - called after fork() in the original process
3) child - called after fork() in the child process

It should be possible to register multiple callbacks for each action
(like at_exit and pthread_atfork(3)).
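
As a rough illustration of the shape such an API could take, the sketch below emulates the three callback stages in plain Ruby.  `AtFork`, `AtFork.register`, and `AtFork.fork` are hypothetical names made up for this example; Ruby provides no such module, and the wrapper only covers forks made through it, not `Kernel#fork` itself.  Running every stage in registration order is also an assumption of the sketch.

```ruby
# Hypothetical emulation of the proposed API; not part of Ruby.
module AtFork
  HOOKS = { prepare: [], parent: [], child: [] }.freeze

  # Multiple callbacks may be registered for each stage.
  def self.register(stage, &block)
    HOOKS.fetch(stage) << block
  end

  # Stand-in for a hooked fork: runs prepare before forking, then parent
  # or child handlers depending on which side of the fork we are on.
  def self.fork
    HOOKS[:prepare].each(&:call)
    pid = Process.fork
    if pid
      HOOKS[:parent].each(&:call)
      return pid
    end
    HOOKS[:child].each(&:call)
    yield
    exit!(0) # skip at_exit handlers inherited from the parent
  end
end

# Returns the stages observed, with the child's report read from a pipe.
def at_fork_demo
  events = []
  reader, writer = IO.pipe
  AtFork.register(:prepare) { events << "prepare" }
  AtFork.register(:parent)  { events << "parent" }
  AtFork.register(:child)   { writer.write("child") }

  pid = AtFork.fork { writer.close }
  writer.close
  child_report = reader.read
  reader.close
  Process.wait(pid)
  events + [child_report]
end
```

Calling `at_fork_demo` once returns `["prepare", "parent", "child"]`.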

These callbacks should be called whenever fork() is used:

- Kernel#fork
- IO.popen
- ``
- Kernel#system

... And any other APIs I've forgotten about

I also want to consider handlers that only need to be called for plain
fork() use (without immediate exec() afterwards, like with `` and system()).

Ruby already has the internal support for most of this to manage mutexes,
Thread structures, and RNG seed.  Currently, no external API is exposed.  I can
prepare a patch if an API is decided upon.




-- 
https://bugs.ruby-lang.org/
