On 02/07/2018 06:19 PM, Eric Wong wrote:
> eregontp / gmail.com wrote:
>> The GIL guarantees all C code is executed with the GIL held, so C code always perfectly sees effects performed by C code of other threads
>> (except rb_thread_call_without_gvl but it is not used here, is it?).
> Correct, GVL protects common cases (string/fixnum/etc)
>
>> In this case: is the code during a Hash lookup, after calling Ruby's #hash/#eql? expecting some state to not have changed since before the call?
>> If so, it should be enough to re-check that state after the call.
> Yes, probably doable.  I thought about this while eating;
> but we can probably use rebuilds_num as a seqlock:
>
> 	unsigned int seq;
>
>     retry:
> 	seq = tab->rebuilds_num; /* needs barriers */
> 	...
> 	...->compare(a, b); /* may rebuild */
> 	if (tab->rebuilds_num != seq)
> 		goto retry;
>
> Not free in terms of overhead, but should still be cheap and
> shouldn't need atomics on the reader side; only memory barriers
> to ensure correct ordering with some CPUs.
>
>> When growing/shrinking a Hash table, the GIL should be held, so that should be observed atomically by other threads.
>> Or is it the problem that #rehash needs to call back to Ruby code?
> Right, the rebuild is done entirely with the GVL held, so that's
> not a problem.  It needs to update rebuild_num atomically
> for the above reader pseudocode to work, but that's a cold path.
>
> Vladimir: thoughts?
>

Yes, I think it could work.  It is probably a solution with minimal 
possible overhead.  Besides compare, there are other external functions 
called by st.c (e.g. func in st_insert2 or st_general_foreach) but 
probably they don't go from C  land to Ruby land (where the thread 
switch can happen and the table modified or even eql method would be 
written to modify the table which would be a very weird code).  As I saw 
these functions are used to inform GC only.  But it worth to add 
additional asserts to check it.

Strange that the old hash tables had no such code already.  Probably we 
were lucky that the bug did not triggered before.

I'll work on it.  I think the patch will be ready on the next week. I 
need to evaluate a performance impact too.


Unsubscribe: <mailto:ruby-core-request / ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>