```
tenderlove / ruby-lang.org wrote:
> normalperson (Eric Wong) wrote:
> > eregontp / gmail.com wrote:
> >  > I want to help protect future programmers from such bugs, if at all possible.
> >  > And I believe it's possible.
> >  > 
> >  > >  I agree with Jeremy on this; it will likely cause new problems
> >  > >  and surprises if libraries use it.
> >  > 
> >  > Let's design it so it doesn't.
> >  > What's the harm/surprise to reconnect in at_fork(:after_child) for instance?
> >  
> >  It's a waste of time and resources when the child has no need for
> >  a connection at all.  Simply put, library authors have no clue
> >  how an application will use/manage processes and would
> >  (rightfully) not bother using such hooks.
> 
> I think library authors can make things easier though.  Web
> frameworks, like Rails for example, are expected to handle
> this situation for the user.  In addition, say a library
> author provided no such feature like Sequel, how would a user
> know they need to call `DB.disconnect` after a fork?  Are they
> responsible for completely understanding the implementation of
> the library they are using?  Even if an end user called
> `DB.disconnect` in an after fork hook, what if that wasn't
> enough?  How would an end user know what needs to be called?

I don't know how to deal with unknowledgeable users aside from
making them knowledgeable about the tools they use.

However, the issue also relates to allowing things to become
thread-safe and fiber-safe at the moment, and in the future:
Guild-safe.

So maybe ko1 can give his thoughts on this topic since he
is working on Guilds

> >  Also, consider that pthread_atfork has been around for many
> >  years, it's not adopted by library authors (of C/C++ libraries)
> >  because of problems surrounding it; and POSIX is even
> >  considering deprecating pthread_atfork[1].
> >  
> >  How about an alternate proposal?
> >  
> >  	Introduce a new object_id-like identifier which changes
> >  	across fork: Thread.current.thread_id
> >  
> >  It doesn't penalize platforms without fork, and can work well
> >  with existing thread-aware code.
> 
> I think this is a good idea, but I'm not sure it addresses the
> communication issue I brought up.  IMO it would be great to
> have some sort of hook so that library authors can dictate
> what "the right thing to do" is after a fork (maybe there are
> other resources or caches that need to be cleaned, and maybe
> that changes from version to version).

Library authors also do not know if the child process will touch
their objects/code; so making such assumptions can be costly
when child processes with different goals are used.

I think the only thing low-level library authors (e.g. authors
of libpq/sqlite3/libcurl/etc...) can do is document how to deal
with such issues when interacting with threads and processes.

Maybe Sequel and AR can have add optional PID checks
($$ != expected_pid), which is totally generic and usable
in bunch of cases.

One thing I have been annoyed by is END/at_exit hooks firing
unexpectedly in child processes.  I work around them with the
PID check.  So I want to avoid having the same annoyance from
at_exit by avoiding atfork hooks.

> >  IMHO, Thread.current.object_id being stable in forked child
> >  isn't good; but I expect compatibility problems if we change it
> >  at this point.  At least some usages of monitor.rb would
> >  break.
> >  
> >  > The current hooks are webserver-specific and so migrating
> >  > between unicorn/puma/passenger/etc means it's quite easy to
> >  > forget to adapt to the new webserver hook, which would trigger
> >  > this bug.
> >  
> >  I hate the amount of vendor lock-in each webserver has.
> >  But making hooks which library authors can fire unpredictably
> >  on application authors is worse, especially if there's no
> >  "opt-out".
> 
> I think requiring users to specify a db disconnect after fork
> causes even more "vendor lock-in".  Lets say I did add the
> after fork code to deal with Sequel, but now I want to switch
> to a threaded webserver.  Now I have to do more work to figure
> out what's required (if anything) in a threaded environment.
> It puts the onus on the app dev to figure out what's right for
> a particular environment, and that means it's harder to
> change: locking you in by making more work.

AFAIK, Sequel and AR already provides out-of-the-box thread-safe
behavior.  Perhaps Sequel and AR should also have out-of-the-box
thread-AND-fork-safe connection pool which checks the PID.  I
don't expect PID check cost to be expensive, but they can make
that an option for users and it'll work on all forking servers
and maybe Guild servers in the future.

People who don't use forking can use old thread-safe-only or
single-threaded-only code.

The key thing I want to avoid from having atfork hooks is:

        DB.disconnect
        pid = fork do
          # something which never touches DB
          exit!(0) # avoid at_exit hook
        end
        DB.reconnect

It's needlessly expensive in the parent process to pay the
disconnect/reconnect cost when forking a child which never
touches the DB.  And there's already a gotcha for avoiding
at_exit hooks.

> Additionally, forking servers all have to provide this type of
> hook anyway (Unicorn, Resque, Puma, to name a few) but today
> they have to specify their own API.  I think it would be great
> if we had a "Rack for fork hooks", if that makes sense.  :)

I don't think limiting this to fork is necessary; forking may
become obsolete with Guilds.
```

Unsubscribe: <mailto:ruby-core-request / ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>