eregontp / gmail.com wrote:
> normalperson (Eric Wong) wrote:
> >  It's been a known problem for decades, now (at least since the
> >  days of mod_perl + DBI on Apache 1.x); and AFAIK there's no data
> >  leaks from it.  Anybody who makes that mistake would likely
> >  raise/crash/burn long before seeing, much less leaking sensitive data.
> 
> Yes, it's not a new problem.
> I disagree about no production leaks, because it happened to me on a website running for a national programming contest.
> For most of the contest it was fine as one process was able to handle the load,
> but at some point the webserver decided to spawn another process by forking,
> people starting seeing each's other solution, the scores were corrupted and everyone was puzzled as to what happened.
> We had to stop the contest due to this.

fork is full of caveats; using atfork hook to work around one
caveat out of many is not a solution.  The solution is knowing
the caveats of the tools you use.

In your case, it seemed like you were not paying attention to
the server setup at all and would not have known to use atfork
hook regardless if it was in the webserver or core.

> I want to help protect future programmers from such bugs, if at all possible.
> And I believe it's possible.
> 
> >  I agree with Jeremy on this; it will likely cause new problems
> >  and surprises if libraries use it.
> 
> Let's design it so it doesn't.
> What's the harm/surprise to reconnect in at_fork(:after_child) for instance?

It's a waste of time and resources when the child has no need for
a connection at all.  Simply put, library authors have no clue
how an application will use/manage processes and would
(rightfully) not bother using such hooks.

Also, consider that pthread_atfork has been around for many
years, it's not adopted by library authors (of C/C++ libraries)
because of problems surrounding it; and POSIX is even
considering deprecating pthread_atfork[1].

How about an alternate proposal?

	Introduce a new object_id-like identifier which changes
	across fork: Thread.current.thread_id

It doesn't penalize platforms without fork, and can work well
with existing thread-aware code.

IMHO, Thread.current.object_id being stable in forked child
isn't good; but I expect compatibility problems if we change it
at this point.  At least some usages of monitor.rb would
break.

> The current hooks are webserver-specific and so migrating
> between unicorn/puma/passenger/etc means it's quite easy to
> forget to adapt to the new webserver hook, which would trigger
> this bug.

I hate the amount of vendor lock-in each webserver has.
But making hooks which library authors can fire unpredictably
on application authors is worse, especially if there's no
"opt-out".


[1] http://austingroupbugs.net/view.php?id=858

Unsubscribe: <mailto:ruby-core-request / ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>