Issue #12020 has been updated by Petr Chalupa.


Thank you for taking the time to read it and for your input. I apologise for the delayed answer; I have been rather busy lately.

>  >    ●volatility (V) - A written value is immediately visible to any
>  >    subsequent volatile read of the same variable on any Thread. It has
>  >    same meaning as in Java, it provides sequential consistency. A volatile
>  >    write happens-before any subsequent volatile read of the same variable.
>  
>  Perhaps we call this "synchronous" or "coherent" instead.
>  The word "volatile" is highly misleading and confusing to me
>  as a C programmer.  (Perhaps I am easily confused :x)

We can definitely consider a different name, though I would defer that decision until later to avoid confusion now.
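To make the guarantee concrete, here is a hedged sketch (plain Ruby, variable names are mine) of the classic publication pattern that volatility enables: under the proposed model the "volatile" write to the flag happens-before any subsequent "volatile" read of it, so the reader is also guaranteed to see the earlier ordinary write. On MRI the GIL makes this safe anyway; the point of the model is to guarantee it on every implementation.

```ruby
# Sketch of the publication pattern; names are illustrative.
data = nil
flag = false

writer = Thread.new do
  data = 42      # ordinary write
  flag = true    # "volatile" write publishes the data
end

reader = Thread.new do
  Thread.pass until flag   # "volatile" read
  data                     # guaranteed to observe 42, never nil
end

writer.join
reader.value  # => 42
```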

>  Anyways, I am not convinced (volatile|synchronous|coherent) access
>  should happen anywhere by default for anything because of costs.
>  
>  Those requiring synchronized data should use special method calls
>  to ensure memory ordering.

I've added the following paragraph to the document, explaining briefly why volatility is preferred.

"The volatile property has a noticeable impact on performance; on the other hand, it is often quite a convenient property, since it simplifies reasoning about the program. Therefore, unless it presents a performance issue, volatility is preferred."

It tries to stay aligned with the rest of the Ruby language in being user-friendly; therefore the volatility behaviour applies to constants and the like. I've also elaborated in the Constants part of the document on why there is no performance loss in making them volatile: "Ruby implementations may take advantage of the constancy of the variables to avoid doing volatile reads on each constant variable read. MRI can check a version number, JRuby can use SwitchPoint, and JRuby+Truffle can use Assumptions; both allow treating the values as real constants during compilation."
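As a concrete, hypothetical illustration of the constant guarantee (the constant name is mine): a thread polling for a constant is assured to see its fully assigned value as soon as the definition becomes visible, never a partially published state.

```ruby
# Sketch: under the proposed model a constant definition is an atomic,
# volatile write, so another thread observes the constant together with
# its complete value.
reader = Thread.new do
  Thread.pass until Object.const_defined?(:ANSWER)
  ANSWER                       # volatile read: sees the full value
end

Object.const_set(:ANSWER, 42)  # atomic, volatile constant write
reader.value  # => 42
```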

>  
>  >    Constant variables
>  >    ●volatility - yes
>  >    ●atomicity - yes
>  >    ●serializability - yes
>  >    ●scope - a module
>  >    A Module or a Class definition is actually a constant definition. The
>  >    definition is atomic, it assigns the Module or the Class to the
>  >    constant, then its methods are defined atomically one by one.
>  >    It’s desirable that once a constant is defined it and its value is
>  >    immediately visible to all threads, therefore it’s volatile.
>  
>  <snip (thread|fiber)-local, no objections there>
>  
>  >    Method table
>  >    ●volatility - yes
>  >    ●atomicity - yes
>  >    ●serializability - yes
>  >    ●scope - a class
>  >    Methods are also stored where operations defacto are: read -> method
>  >    lookup, write -> method redefinition, define -> method definition,
>  >    undefine -> method removal. Operations over method tables have to be
>  >    visible as soon as possible otherwise Threads could execute different
>  >    versions of methods leading to unpredictable behaviour, therefore they
>  >    are marked volatile. When a method is updated and the method is being
>  >    executed by a thread, the thread will finish the method body and it’ll
>  >    use the updated method obtained on next method lookup.
>  
>  I strongly disagree with volatility in method and constant tables.  Any
>  programs defining methods/constants in parallel threads and expecting
>  them to be up-to-date deserve all the problems they get.

I see that this approach would be easier for Ruby implementers; on the other hand, it would create very-hard-to-debug bugs for users. Even though I agree that they should not do parallel loading, I would still like to protect them. Making both volatile should have only a minor impact on code loading; if that turns out not to be the case, it should definitely be reconsidered.
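A hedged sketch of the lookup behaviour described in the quoted text (class and method names are mine): a running thread keeps calling a method and picks up a redefinition on its next method lookup.

```ruby
# Sketch: method-table volatility means a redefinition becomes visible
# to other threads on their next method lookup.
class Greeter
  def self.greet; "old"; end
end

observer = Thread.new do
  # each call performs a fresh lookup; spins until the redefinition is seen
  Thread.pass while Greeter.greet == "old"
  Greeter.greet
end

class Greeter
  def self.greet; "new"; end  # redefinition: a volatile method-table write
end

observer.value  # => "new"
```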

>  
>  Maybe volatility for require/autoload is a special case only iff a
>  method/constant is missing entirely; but hitting old methods/constants
>  should be allowed by the implementation.

The volatility of require/autoload, and the fact that they block while another thread is loading the given file/constant, are very useful in a parallel environment to make sure that a feature/class is fully loaded before it is used. Both are usually invoked only on program paths which run once during loading or reloading, therefore they are not performance-critical.
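The blocking behaviour can be sketched as follows (the file contents and names are mine): several threads require the same feature, exactly one of them executes the file, and the rest block until it is fully loaded before returning `false`.

```ruby
require "tmpdir"

# Sketch: concurrent require of the same file. One thread performs the
# load; the others block on it and then see the fully defined constant.
dir  = Dir.mktmpdir
path = File.join(dir, "feature.rb")
File.write(path, "FEATURE_LOADED = true\n")

results = 4.times.map { Thread.new { require path } }.map(&:value)

FEATURE_LOADED       # => true, fully visible to every thread
results.count(true)  # => 1, the file was executed exactly once
```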

>  
>  Methods (and all other objects) are already protected from memory
>  corruption and use-after-free by GC.  There is no danger in segfaulting
>  when old/stale methods get run.
>  
>  The inline, global (, and perhaps in the future: thread-specific)
>  caches will all become expensive if we need to ensure read-after-write
>  consistency by checking for changes on methods and constants made
>  by other threads.

I've tried to explain this a little in the document; it should not add any overhead. MRI with the GIL does not have to do anything extra to ensure read-after-write consistency, and the other, compiling implementations already actively invalidate compiled code that depends on a constant (or a method) which was just redefined.
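A toy model (all names are mine, a real implementation does this in C with a global serial) of the version-number technique: an inline cache stores the looked-up value together with the table's serial, so the common-case read is a plain integer comparison with no lookup, and a redefinition invalidates every cache at once by bumping the serial.

```ruby
# Toy model of the version-number trick for volatile-yet-cheap reads.
class ConstTable
  attr_reader :serial

  def initialize
    @consts = {}
    @serial = 0
  end

  def define(name, value)
    @consts[name] = value
    @serial += 1          # the write that invalidates all inline caches
  end

  def lookup(name)
    @consts[name]
  end
end

class InlineCache
  def initialize(table)
    @table  = table
    @serial = -1          # never matches before the first lookup
  end

  def read(name)
    return @value if @serial == @table.serial  # fast path: no lookup
    @value  = @table.lookup(name)              # slow path after redefinition
    @serial = @table.serial
    @value
  end
end

table = ConstTable.new
table.define(:ANSWER, 41)
cache = InlineCache.new(table)
cache.read(:ANSWER)        # => 41, slow path fills the cache
table.define(:ANSWER, 42)  # bumps the serial, the cache notices
cache.read(:ANSWER)        # => 42
```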

I did not entirely understand why you are against volatility on constants and methods; I have tried to explain better why they are suggested to be volatile. Could you elaborate?

>  
>  >    Threads
>  >    Threads have the same guarantees as in in Java. Thread.new
>  >    happens-before the execution of the new thread’s block. All operations
>  >    done by the thread happens-before the thread is joined. In other words,
>  >    when a thread is started it sees all changes made by its creator and
>  >    when a thread is joined, the joining thread will see all changes made
>  >    by the joined thread.
>  
>  Good.  For practical reasons, this should obviate the need for
>  constant/method volatility specified above.

These guarantees would certainly help if constants and methods were not volatile, as would the require and autoload guarantees.
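The two quoted guarantees can be demonstrated directly (the `log` variable is illustrative): the creator's writes are visible inside the new thread's block, and the thread's writes are visible to the joiner after `join` returns.

```ruby
# Sketch of the Thread.new / Thread#join happens-before guarantees.
log = []
log << "creator"    # Thread.new happens-before the new thread's block

t = Thread.new do
  log << "child"    # all of the thread's writes happen-before the join
end

t.join              # the child's writes are now visible here
log  # => ["creator", "child"]
```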

>  
>  >    Beware of requiring and autoloading in concurrent programs, it's
>  >    possible to see partially defined classes. Eager loading or blocking
>  >    until classes are fully loaded should be used to mitigate.
>  
>  No disagreement, here :)



----------------------------------------
Feature #12020: Documenting Ruby memory model
https://bugs.ruby-lang.org/issues/12020#change-56885

* Author: Petr Chalupa
* Status: Open
* Priority: Normal
* Assignee: 
----------------------------------------
Defining a memory model for a language is necessary to be able to reason about a program's behavior in a concurrent or parallel environment.

A document was created describing a Ruby memory model for the concurrent-ruby gem, one which fits several Ruby language implementations. It was necessary in order to build a lower-level unifying layer that enables the creation of concurrency abstractions: they can be implemented just once against the layer, which ensures that they run on all Ruby implementations.

Because of its GIL semantics, the Ruby MRI implementation has stronger (undocumented) guarantees than the memory model, but the few relaxations from MRI's behavior allow other implementations to fit the model as well and to improve performance.

This issue proposes to document the Ruby memory model. The memory model document mentioned above, created for concurrent-ruby, can be used as a starting point: https://docs.google.com/document/d/1pVzU8w_QF44YzUCCab990Q_WZOdhpKolCIHaiXG-sPw/edit#. Please comment in the document or here.

The aggregating issue of this effort can be found [here](https://bugs.ruby-lang.org/issues/12019).



-- 
https://bugs.ruby-lang.org/
