What about the case where you have two distinct classes of threads?

1) needs up-to-date data

2) tolerates stale data, just wants to read as fast as possible
   without blocking itself or other readers/writers

I consider supporting 2) important for scalability.

I'd rather have specialized methods for this:
atomic_set, atomic_read, etc.
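
To make the split between 1) and 2) concrete, here is a minimal sketch,
not an existing Ruby API. It assumes atomic_set/atomic_read are the
explicit "I need the latest value" calls, and that a plain unlocked read
is the fast path for threads that tolerate stale data. SnapshotCell and
relaxed_read are hypothetical names made up for illustration. It also
assumes that reading a single object reference never tears, which holds
on current implementations as far as I know but is not something the
language formally guarantees.

```ruby
# Hypothetical sketch: SnapshotCell and relaxed_read are invented names;
# atomic_set / atomic_read are the method names proposed in the post.
class SnapshotCell
  def initialize(value)
    @write_lock = Mutex.new
    @value = value.freeze   # note: freezes the object passed in
  end

  # Writers serialize among themselves and publish a frozen snapshot,
  # so a reader can never observe a half-updated object.
  def atomic_set(value)
    @write_lock.synchronize { @value = value.freeze }
  end

  # Class 1) threads: take the lock, so the result reflects the most
  # recently completed atomic_set.
  def atomic_read
    @write_lock.synchronize { @value }
  end

  # Class 2) threads: no lock, never blocks on writers, but may return
  # a snapshot that is slightly out of date.
  def relaxed_read
    @value
  end
end

cell = SnapshotCell.new({ hits: 0 })
cell.atomic_set({ hits: 1 })
fresh = cell.atomic_read    # synchronized read
fast  = cell.relaxed_read   # unsynchronized read
```

The point of the design is that only callers who ask for atomic_read pay
for synchronization; everyone else reads an immutable snapshot for free.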

But on the other hand, maybe an ultra-high-level language like Ruby
shouldn't rely on a shared memory model at all and should favor
message passing for safety instead.
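
For comparison, a rough sketch of the message-passing style using the
core Queue class. One thread owns the state and nobody else touches it;
other threads only send it messages. The message format ([:incr, nil],
[:get, reply_queue]) and the use of nil as a shutdown signal are
assumptions made up for this example.

```ruby
# Sketch: a single owner thread holds the counter; all other threads
# communicate with it through queues instead of sharing memory.
requests = Queue.new

owner = Thread.new do
  counter = 0                    # state is private to this thread
  while (msg = requests.pop)     # nil is the (assumed) shutdown signal
    kind, reply = msg
    case kind
    when :incr then counter += 1
    when :get  then reply << counter
    end
  end
  counter                        # becomes owner.value
end

# Four workers each request 100 increments.
workers = 4.times.map do
  Thread.new { 100.times { requests << [:incr, nil] } }
end
workers.each(&:join)

# Ask the owner for the current value over a private reply queue.
reply = Queue.new
requests << [:get, reply]
total = reply.pop   # 400: all increments were queued before :get (FIFO)

requests << nil     # shut the owner down
owner.join
```

No memory model questions come up here because no variable is ever
touched by more than one thread; the cost is a queue round trip per
read.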