On Tue, 12 Jun 2007 07:32:11 +0900, Joel VanderWerf <vjoel / path.berkeley.edu> wrote:
> So you're saying that, as the line
> 
>          @obj = Whatever.new
> 
> is executed, it is possible (in some ruby implementations, but not MRI
> 1.8) that @obj will be assigned the Whatever instance *before* the #new
> call completes? (That does seem to be what the wikipedia entry is
> warning about.[1])

More or less.  Some of this comes from instruction reordering performed by both the compiler and the CPU, and some of it comes from writes not reaching shared memory right away (they can sit in a register or a per-CPU cache for a while).  Either way, it's possible for one thread's operations to appear in a different order to other threads.

The effect of a memory barrier is to bring all threads involved into sync, so to speak, so that whatever is going on behind the scenes, they all see things happening in the same order.  Memory barriers are the blue pill -- you really don't want to see how deep the concurrency rabbit hole goes.

> I wonder if the following will be a more efficient alternative, or worse
> because of the singleton method:
> 
>    def obj
>      @lock.synchronize do
>        @obj = Whatever.new
>        def self.obj
>          @obj
>        end
>      end
>      @obj
>    end
>
> I suppose that has the same problem, depending on implementation, in
> that the method definition could be reordered before the assignment...

There's also a race condition.  Consider:

 thread #1: enters #obj
 thread #2: enters #obj
 thread #2: acquires @lock
 thread #1: waits for @lock
 thread #2: @obj = Whatever.new (#<Whatever:0x1234abcd>)
 thread #2: redefines obj
 thread #2: releases @lock
 thread #2: returns @obj (#<Whatever:0x1234abcd>)
 thread #1: wakes up
 thread #1: acquires @lock
 thread #1: @obj = Whatever.new (#<Whatever:0xcafebeef>)
 thread #1: redefines obj
 thread #1: releases @lock
 thread #1: returns @obj (#<Whatever:0xcafebeef>)
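
The interleaving above can be provoked with a small sketch.  Everything here beyond the quoted #obj body is an assumption made for illustration: the RacyHolder class is hypothetical, and the stand-in Whatever sleeps in its constructor only to widen the window in which the second thread can enter the original #obj before it is redefined.

```ruby
require 'thread'  # needed for Mutex on Ruby 1.8; harmless on later versions

# Hypothetical stand-in for Whatever.  The sleep only widens the race window.
class Whatever
  def initialize
    sleep 0.1
  end
end

# Hypothetical wrapper around the #obj method from the quoted message.
class RacyHolder
  def initialize
    @lock = Mutex.new
  end

  def obj
    @lock.synchronize do
      @obj = Whatever.new   # unconditional: every caller that gets here builds a new one
      def self.obj
        @obj
      end
    end
    @obj
  end
end

holder = RacyHolder.new
# Both threads call the original #obj before either has redefined it.
racy_results = Array.new(2) { Thread.new { holder.obj } }.map { |t| t.value }

# Number of distinct instances handed out; with the schedule above this is 2.
puts racy_results.map { |o| o.object_id }.uniq.size
```

Whether the bad schedule shows up on a given run depends on timing, so treat this as a demonstration rather than a proof.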

Sometimes when writing multi-threaded code, it helps to pretend that the thread scheduler hates your guts and has it in for you personally.  I think you'd have to do something like this at least:

 def obj
   @lock.synchronize do
     @obj ||= Whatever.new
     def self.obj
       @obj
     end
     @obj
   end
 end

But there's still the unsynchronized read of @obj from within the singleton method, which can even occur before @lock has been released by the writing thread.  It might often work in practice, since the Ruby implementation is (hopefully) doing some kind of synchronization when reading/updating its method tables, preventing the CPU from reordering things across the method definition.  But a lot depends on the memory model, and there's some pretty crazy stuff out there (*cough* Alpha *cough*)...

It's probably best to stick to the simplest correct way:

 def obj
   @lock.synchronize do
     @obj ||= Whatever.new
   end
 end
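
As a rough check of that version, here is a minimal sketch with several threads calling #obj at once.  The Holder class and the empty Whatever are placeholders I made up for the example; only the #obj body comes from above.

```ruby
require 'thread'  # needed for Mutex on Ruby 1.8; harmless on later versions

class Whatever; end   # hypothetical placeholder for the real class

# Hypothetical wrapper holding the lock and the lazily created object.
class Holder
  def initialize
    @lock = Mutex.new
  end

  def obj
    @lock.synchronize do
      @obj ||= Whatever.new   # both the check and the assignment happen under the lock
    end
  end
end

holder = Holder.new
objects = Array.new(8) { Thread.new { holder.obj } }.map { |t| t.value }

# Every thread should have been handed the same instance.
puts objects.map { |o| o.object_id }.uniq.size   # prints 1
```

Since every read and write of @obj goes through the same lock, there is nothing for the scheduler or the memory model to reorder in a way that callers can observe.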

If the locking turns out to be a bottleneck, then the method's callers can cache its result in variables private to their respective threads, which will not require any locking at all to read.
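
A minimal sketch of that caching idea, assuming each thread simply keeps the result in a local variable.  CountingHolder and its lock_count counter are hypothetical, added only so the number of locked calls can be observed.

```ruby
require 'thread'  # needed for Mutex on Ruby 1.8; harmless on later versions

class Whatever; end   # hypothetical placeholder for the real class

# Hypothetical holder that also counts how often the lock is taken.
class CountingHolder
  attr_reader :lock_count

  def initialize
    @lock = Mutex.new
    @lock_count = 0
  end

  def obj
    @lock.synchronize do
      @lock_count += 1        # incremented under the lock, so the count is exact
      @obj ||= Whatever.new
    end
  end
end

holder = CountingHolder.new
workers = Array.new(4) do
  Thread.new do
    cached = holder.obj          # one locked call per thread
    1000.times { cached.hash }   # later uses touch only this thread's local variable
    cached
  end
end
cached_objects = workers.map { |t| t.value }

puts holder.lock_count                                   # prints 4
puts cached_objects.map { |o| o.object_id }.uniq.size    # prints 1
```

The local variable belongs to the block each thread runs, so no other thread can see it and no lock is needed to read it.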

-mental