Issue #921 has been updated by Charles Nutter.


The concurrent require issue is separate. I would like to understand what 1.9.3 does to make concurrent requires safe. A global lock around any require? I know there was a lock against specific filenames added at some point (which JRuby also does)...is there something more?

I do not understand how the autoload problem was fixed. Here is a simpler example...

autoload.rb:

class Object
  autoload :X, 'constant.rb'
end

Thread.abort_on_exception = true

Thread.new {
  puts "thread #{Thread.current} accessing X; defined? X == #{(defined? X).inspect}"
  X
}
Thread.new {
  puts "thread #{Thread.current} accessing X; defined? X == #{(defined? X).inspect}"
  X
}
sleep


constant.rb:

puts "thread #{Thread.current} in constant.rb"

# check that X is not defined
puts "X defined: #{(defined? X).inspect}"

# simulate a slow file load or a deep chain of requires
1_000_000.times { Thread.pass }

class Object
  # define X
  X = 1
end

Let me review the problem.

When an autoload is triggered, the first step is to remove the autoload constant. This gives the file being required a blank slate when it (presumably) defines the constant the autoload was registered for. The example above does indeed print "X defined: nil".

Now if we have two threads that encounter an autoloaded constant at roughly the same time, I would expect this sequence is possible:

For our autoloaded constant X:

1. Thread A encounters the autoload for X and removes the constant. It proceeds to require the associated file which (eventually) will define X.
2. Because thread A is now in Ruby code, a context switch can occur to thread B.
3. Thread B attempts to access the autoloaded constant X; however, it has been removed by A, so thread B gets a NameError. It does *not* wait for the require to complete, because it never attempts to perform a require.
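
The interleaving in steps 1 to 3 can be forced deterministically in a toy model. This is only a sketch and not how any Ruby implementation stores constants: the Hash `table`, the `marker` object, and the `lookup` lambda are hypothetical stand-ins, and the two queues exist only to pin down the ordering.

```ruby
require 'thread'

# Toy stand-in for a constant table (an assumption for illustration only).
marker     = Object.new           # plays the role of the autoload marker
table      = { X: marker }
table_lock = Mutex.new            # protects the Hash itself, NOT the autoload sequence

marker_removed = Queue.new        # A -> B: "the marker is gone"
b_finished     = Queue.new        # B -> A: "my lookup is done"

# Hypothetical lookup: returns the stored value or raises NameError if absent.
lookup = lambda do |name|
  table_lock.synchronize do
    raise NameError, "uninitialized constant #{name}" unless table.key?(name)
    table[name]
  end
end

thread_a = Thread.new do
  table_lock.synchronize { table.delete(:X) }  # step 1: remove the marker
  marker_removed << true                       # step 2: control passes to B
  b_finished.pop                               # the slow "require" is still running
  table_lock.synchronize { table[:X] = 1 }     # the required file finally defines X
  lookup.call(:X)
end

thread_b = Thread.new do
  marker_removed.pop                           # run only after A removed the marker
  begin
    lookup.call(:X)                            # step 3: no marker and no value yet
  rescue NameError
    :name_error
  ensure
    b_finished << true                         # let A finish its load
  end
end

result_a = thread_a.value
result_b = thread_b.value
puts "A saw #{result_a.inspect}, B saw #{result_b.inspect}"
```

In this model thread B never sees a marker, so nothing tells it that a load is in progress and that it could wait.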

This is what happens in JRuby, Rubinius, and Ruby 1.8.7. It does not happen in Ruby 1.9.2 or MacRuby. Why?

Here is the output under 1.8.7:

~/projects/jruby $ ruby autoload.rb
thread #<Thread:0x100169dc8> accessing X; defined? X == "constant"
thread #<Thread:0x100169dc8> in constant.rb
X defined: nil
thread #<Thread:0x100169080> accessing X; defined? X == nil
autoload.rb:13: uninitialized constant X (NameError)
	from autoload.rb:11:in `initialize'
	from autoload.rb:11:in `new'
	from autoload.rb:11

What appears to happen in 1.9.2 is that thread B encounters the X autoload and pauses. This could be explained by a lock on concurrent requires of the same file (if synchronizing on the filename) or of different files (if there is a single global lock), but that still does not explain why autoload behaves this way in 1.9.2. Why? Because if thread A is *actually* removing the constant, thread B should never know that X is an autoload constant, and so should never block on a require lock of any kind. And it does appear that the constant has been removed...so how does thread B know to pause?

I could dig through the MRI code to find this, but perhaps someone can describe in simple terms how 1.9.2 prevents thread B in the example above from running when it encounters X?
----------------------------------------
Bug #921: autoload is not thread-safe
http://redmine.ruby-lang.org/issues/921

Author: Charles Nutter
Status: Closed
Priority: Normal
Assignee: Nobuyoshi Nakada
Category: 
Target version: 
ruby -v: -


=begin
 Currently autoload is not safe to use in a multi-threaded application. To put it more bluntly, it's broken.
 
 The current logic for autoload is as follows:
 
 1. A special object is inserted into the target constant table, used as a marker for autoloading
 2. When that constant is looked up, the marker is found and triggers autoloading
 3. The marker is first removed, so the constant now appears to be undefined if retrieved concurrently
 4. The associated autoload resource is required, and presumably redefines the constant in question
 5. The constant lookup, upon completion of autoload, looks up the constant again and either returns its new value or proceeds with normal constant resolution
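
The five steps above can be sketched as a single-threaded toy model. Everything here is hypothetical and for illustration only: `ToyConstantTable`, `Marker`, `known?`, and the `files` Hash of lambdas (standing in for files on disk) are assumptions, and raising NameError stands in for "proceeds with normal constant resolution".

```ruby
# Toy model of the autoload sequence; not the code of any real implementation.
class ToyConstantTable
  Marker = Struct.new(:path)

  def initialize(files)
    @files = files   # path => lambda standing in for the file's contents
    @table = {}
  end

  def autoload(name, path)
    @table[name] = Marker.new(path)          # step 1: insert the marker
  end

  def define(name, value)
    @table[name] = value
  end

  def known?(name)
    @table.key?(name)
  end

  def lookup(name)
    entry = @table[name]
    if entry.is_a?(Marker)                   # step 2: marker found, autoload fires
      @table.delete(name)                    # step 3: marker removed first
      @files.fetch(entry.path).call(self)    # step 4: "require" the resource
      entry = @table[name]                   # step 5: look the constant up again
    end
    raise NameError, "uninitialized constant #{name}" unless @table.key?(name)
    entry
  end
end

seen_during_load = nil
files = {
  'constant.rb' => lambda do |table|
    seen_during_load = table.known?(:X)      # what a concurrent reader would observe
    table.define(:X, 1)
  end
}

table = ToyConstantTable.new(files)
table.autoload(:X, 'constant.rb')
value = table.lookup(:X)
puts "during load: known? == #{seen_during_load.inspect}; after load: X == #{value.inspect}"
```

The window between steps 3 and 5, where `known?` is false, is the state a second thread can observe.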
 
 The problem arises when two or more threads try to access the constant. Because autoload is stateful and unsynchronized, the second thread may encounter the constant table in any number of states:
 
 1. It may see that the autoload has not yet fired, if the first thread has encountered the marker but not yet removed it. It would then proceed along the same autoload path, requiring the same file a second time.
 2. It may not find an autoload marker, and assume the constant does not exist.
 3. It may see the eventual constant the autoload was intended to define.
 
 Of these outcomes, (3) is obviously the desired behavior. (1) can only happen on native-threaded implementations that do not have a global interpreter lock, since it requires concurrency during autoload's internal logic. (2) can happen on any implementation, since the original autoload constant appears to be undefined while the required file is being processed.
 
 I have only come up with two solutions:
 
 * When the autoload marker is encountered, it is replaced (under lock) with an "autoload in progress" marker. All subsequent threads will then see this marker and wait for the autoloading process to complete. The mechanics of this are a little tricky, but it would guarantee that concurrent autoloads load the target file only once and always return the intended value to concurrent readers.
 * A single autoload mutex, forcing all autoloads to happen in serial.
 
 There is a potential for deadlock in the first solution, unfortunately, since two threads autoloading two constants with circular autoloaded constant dependencies would ultimately deadlock, each waiting for the other to complete. Because of this, a single autoload mutex for all autoloads may be the only safe solution.
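
A minimal sketch of the single-lock idea, under stated assumptions: `SerializedAutoloadTable`, `Marker`, `load_count`, and the `files` Hash of lambdas are hypothetical names for a toy model, not MRI or JRuby internals. It uses a `Monitor` rather than a plain `Mutex`, on the assumption that a file being autoloaded may itself trigger another autoload on the same thread and therefore needs a reentrant lock. For brevity the toy takes the lock on every lookup; a real implementation would presumably take it only when it hits a marker.

```ruby
require 'monitor'

# Toy model: one reentrant lock serializes every autoload.
class SerializedAutoloadTable
  Marker = Struct.new(:path)

  attr_reader :load_count

  def initialize(files)
    @files      = files            # path => lambda standing in for the file's contents
    @table      = {}
    @lock       = Monitor.new      # reentrant, so nested autoloads do not self-deadlock
    @load_count = Hash.new(0)      # how many times each path was "required"
  end

  def autoload(name, path)
    @lock.synchronize { @table[name] = Marker.new(path) }
  end

  def define(name, value)
    @lock.synchronize { @table[name] = value }
  end

  def lookup(name)
    @lock.synchronize do
      entry = @table[name]
      if entry.is_a?(Marker)
        @table.delete(name)                  # blank slate for the loading file
        @load_count[entry.path] += 1
        @files.fetch(entry.path).call(self)  # other threads block on @lock meanwhile
        entry = @table[name]
      end
      raise NameError, "uninitialized constant #{name}" unless @table.key?(name)
      entry
    end
  end
end

files = {
  'constant.rb' => lambda do |table|
    1_000.times { Thread.pass }              # give other threads a chance to run mid-load
    table.define(:X, 1)
  end
}

table = SerializedAutoloadTable.new(files)
table.autoload(:X, 'constant.rb')

results = 4.times.map { Thread.new { table.lookup(:X) } }.map(&:value)
puts "results: #{results.inspect}, loads: #{table.load_count['constant.rb']}"
```

Because the whole sequence runs under one lock, waiting threads cannot observe the window in which the marker has been removed, and the file is loaded once. The cost is that unrelated autoloads also wait on each other.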
=end



-- 
http://redmine.ruby-lang.org