Issue #921 has been updated by Hiroshi Nakamura.


=begin

Here's the updated patch: [https://github.com/nahi/ruby/compare/11667b9c...03ddf439]

Summary

 * ((*What's the problem?*)) autoload is not thread-safe. When we define a constant to be autoloaded, we expect the constant's construction to be invariant, i.e. no other thread should observe it half-built. But the current autoload implementation allows other threads to access the constant while the first thread is still loading the file. See http://prezi.com/ff9yptxhohjz/making-autoload-thread-safe/ for an example.

 * ((*What's happening inside?*)) The current implementation uses Qundef as the autoload marker in the constant table. Once the first thread finds Qundef as the value during constant lookup, it starts loading the registered feature. Generally the loaded file overwrites the Qundef in the constant table with a module/class declaration in its very first lines, so other threads can see the new Module/Class object before feature loading has finished. This breaks invariant construction.

 * ((*How to solve?*)) To ensure invariant constant construction, we need to overwrite Qundef with the defined object only after feature loading has finished. To keep Qundef in the constant table in the meantime, I expanded the autoload_data struct in Module with a slot that holds the defined object during feature loading, and changed Module's constant lookup/update logic a little so that the slot is visible only to the thread that invoked feature loading (i.e. the first thread that accessed the autoload constant).

 * ((*Evaluation?*)) All tests pass (bootstrap test, test-all and RubySpec), and I added 8 tests for threading behavior. The extra logic is executed only when Qundef is found, so no performance drop should occur outside of autoloading.
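
As a rough illustration of the slot idea summarized above, here is a minimal pure-Ruby model. It is only a sketch and not the actual C patch: `SlotConstTable`, `Entry`, `define` and `lookup` are hypothetical names, an `Entry` stands in for Qundef plus autoload_data, and the per-entry `Mutex` is an assumption standing in for whatever makes other threads wait while the feature loads.

```ruby
# Hypothetical model of the patched lookup. All names are illustrative.
class SlotConstTable
  # One record per autoload constant: the loader to run, the thread that is
  # running it, a lock other threads wait on, and the hidden slot.
  Entry = Struct.new(:loader, :thread, :lock, :slot)

  def initialize
    @table = {}
  end

  # Register an autoload; the Entry plays the role of the Qundef marker.
  def autoload(name, &loader)
    @table[name] = Entry.new(loader, nil, Mutex.new, nil)
  end

  # Called by the "loaded file". While loading, the value goes into the
  # slot, so the marker stays in the table for every other thread.
  def define(name, value)
    entry = @table[name]
    if entry.is_a?(Entry) && entry.thread == Thread.current
      entry.slot = value
    else
      @table[name] = value
    end
  end

  def lookup(name)
    entry = @table.fetch(name)
    return entry unless entry.is_a?(Entry)               # ordinary constant
    return entry.slot if entry.thread == Thread.current  # loader sees its slot

    entry.lock.synchronize do
      if @table[name].equal?(entry)   # not loaded yet by another thread
        entry.thread = Thread.current
        begin
          entry.loader.call(self)
        ensure
          entry.thread = nil
        end
        @table[name] = entry.slot     # publish only after loading finished
      end
    end
    @table.fetch(name)
  end
end

table = SlotConstTable.new
started = Queue.new
resume = Queue.new
table.autoload(:Foo) do |t|
  t.define(:Foo, { ready: false })  # like `class Foo` at the top of the file
  started << true
  resume.pop                        # pause in the middle of "loading"
  t.lookup(:Foo)[:ready] = true     # the loading thread can see its own slot
end

loader = Thread.new { table.lookup(:Foo)[:ready] }
started.pop
reader = Thread.new { table.lookup(:Foo)[:ready] }
resume << true
p [loader.value, reader.value]
```

In this model the second thread never observes the half-built value: it either waits for the loader to finish or finds the published object.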

I'll commit this soon. Committers, please evaluate this.
=end

----------------------------------------
Bug #921: autoload is not thread-safe
http://redmine.ruby-lang.org/issues/921

Author: Charles Nutter
Status: Open
Priority: Normal
Assignee: Hiroshi Nakamura
Category: core
Target version: 1.9.x
ruby -v: ruby 1.8.7 (2011-02-18 patchlevel 334) [x86_64-linux]


=begin
 Currently autoload is not safe to use in a multi-threaded application. To put it more bluntly, it's broken.
 
 The current logic for autoload is as follows:
 
 1. A special object is inserted into the target constant table, used as a marker for autoloading
 2. When that constant is looked up, the marker is found and triggers autoloading
 3. The marker is first removed, so the constant now appears to be undefined if retrieved concurrently
 4. The associated autoload resource is required, and presumably redefines the constant in question
 5. The constant lookup, upon completion of autoload, looks up the constant again and either returns its new value or proceeds with normal constant resolution
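
For readers less familiar with autoload, the lifecycle above can be observed from Ruby with `Module#autoload` and `Module#autoload?`. This is a small single-threaded sketch; the constant name `AutoloadDemo` and the temporary file are made up for the example, and it does not show the internal marker itself, only its visible effect.

```ruby
require "tmpdir"

Dir.mktmpdir do |dir|
  # A throwaway feature file that defines the constant (step 4).
  path = File.join(dir, "autoload_demo.rb")
  File.write(path, "module AutoloadDemo\n  VALUE = 42\nend\n")

  # Step 1: register the autoload; the constant table now holds a marker.
  Object.autoload(:AutoloadDemo, path)
  pending = Object.autoload?(:AutoloadDemo)  # the path while the autoload is pending

  # Steps 2 to 5: the first reference triggers the load and returns the value.
  value = AutoloadDemo::VALUE
  done = Object.autoload?(:AutoloadDemo)     # nil once the constant is defined

  puts "pending autoload: #{pending == path}"
  puts "value: #{value}"
  puts "autoload cleared: #{done.nil?}"
end
```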
 
 The problem arises when two or more threads try to access the constant. Because autoload is stateful and unsynchronized, the second thread may encounter the constant table in any number of states:
 
 1. It may see the autoload has not yet fired, if the first thread has encountered the marker but not yet removed it. It would then proceed along the same autoload path, requiring the same file a second time.
 2. It may not find an autoload marker, and assume the constant does not exist.
 3. It may see the eventual constant the autoload was intended to define.
 
 Of these outcomes, (3) is obviously the desired behavior. (1) can only happen on native-threaded implementations that do not have a global interpreter lock, since it requires concurrency during autoload's internal logic. (2) can happen on any implementation, since the autoload constant appears to be undefined while the required file is being processed.
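
State (2) can be reproduced deterministically with a small model of the five steps listed earlier. This is a sketch under assumptions, not interpreter code: `LegacyConstTable`, `MARKER` and the `:undefined` return value are hypothetical stand-ins, and the queues only force the second lookup to happen while the "file" is still loading. State (1) needs true parallelism and is not shown.

```ruby
# Hypothetical model of the unsynchronized autoload logic.
class LegacyConstTable
  MARKER = Object.new.freeze

  def initialize
    @table = {}
    @loaders = {}
  end

  def autoload(name, &loader)
    @loaders[name] = loader
    @table[name] = MARKER                    # step 1: insert the marker
  end

  def define(name, value)
    @table[name] = value
  end

  def lookup(name)
    return :undefined unless @table.key?(name)
    value = @table[name]
    return value unless value.equal?(MARKER) # step 2: marker found?
    @table.delete(name)                      # step 3: marker removed first
    @loaders.fetch(name).call(self)          # step 4: "require" the file
    @table.fetch(name, :undefined)           # step 5: look it up again
  end
end

table = LegacyConstTable.new
started = Queue.new
resume = Queue.new
table.autoload(:Foo) do |t|
  started << true
  resume.pop                 # the file is still being processed
  t.define(:Foo, :loaded_foo)
end

first = Thread.new { table.lookup(:Foo) }
started.pop
during = table.lookup(:Foo)  # state (2): the constant looks undefined
resume << true
puts "during load: #{during.inspect}"
puts "first thread: #{first.value.inspect}"
puts "afterwards: #{table.lookup(:Foo).inspect}"  # state (3)
```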
 
 I have only come up with two solutions:
 
 * When the autoload marker is encountered, it is replaced (under lock) with an "autoload in progress" marker. All subsequent threads will then see this marker and wait for the autoloading process to complete. The mechanics of this are a little tricky, but it would guarantee that concurrent autoloads load the target file only once and always return the intended value to concurrent readers.
 * A single autoload mutex, forcing all autoloads to happen in serial.
 
 There is a potential for deadlock in the first solution, unfortunately, since two threads autoloading two constants with circular autoloaded constant dependencies would ultimately deadlock, each waiting for the other to complete. Because of this, a single autoload mutex for all autoloads may be the only safe solution.
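
The second option can be sketched in the same modelling style. Everything here is hypothetical (`SerializedConstTable`, `AUTOLOAD_LOCK`, `MARKER`), and using a reentrant `Monitor` rather than a plain `Mutex` is an assumption made so that a loader can trigger a nested autoload on the same thread. The marker is left in place until the loader defines the constant, so late arrivals wait on the lock instead of seeing an undefined constant.

```ruby
require "monitor"

# Hypothetical model: one global lock serializes every autoload.
class SerializedConstTable
  MARKER = Object.new.freeze
  AUTOLOAD_LOCK = Monitor.new  # reentrant, shared by all autoloads

  def initialize
    @table = {}
    @loaders = {}
  end

  def autoload(name, &loader)
    @loaders[name] = loader
    @table[name] = MARKER
  end

  def define(name, value)
    @table[name] = value
  end

  def lookup(name)
    value = @table.fetch(name)
    return value unless value.equal?(MARKER)  # fast path: no autoload pending

    AUTOLOAD_LOCK.synchronize do
      # Re-check under the lock: another thread may have loaded it already.
      @loaders.fetch(name).call(self) if @table[name].equal?(MARKER)
    end
    @table.fetch(name)
  end
end

# Two constants whose "files" refer to each other.
table = SerializedConstTable.new
table.autoload(:A) { |t| t.define(:A, :a_value); t.lookup(:B) }
table.autoload(:B) { |t| t.define(:B, :b_value); t.lookup(:A) }

threads = [Thread.new { table.lookup(:A) }, Thread.new { table.lookup(:B) }]
p threads.map(&:value)
```

With per-constant "in progress" markers the two threads above could each hold one constant while waiting for the other; with the single lock whichever thread enters first loads both. Note that the lock-free fast path in this sketch still lets a second thread see a constant as soon as the loader defines it, before the rest of the file has run.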
=end



-- 
http://redmine.ruby-lang.org