Thank you. This is very interesting, informative and educational. Good 
look with getting your patch accepted.


On 11/02/2016 06:41 PM, funny.falcon / gmail.com wrote:
> Issue #12142 has been updated by Yura Sokolov.
>
> File hash_improvements_additional.mbox added
>
> ----------------------------------------
> Feature #12142: Hash tables with open addressing
> https://bugs.ruby-lang.org/issues/12142#change-61184
>
> * Author: Vladimir Makarov
> * Status: Open
> * Priority: Normal
> * Assignee:
> ----------------------------------------
> ~~~
>   Hello, the following patch contains a new implementation of hash
> tables (major files st.c and include/ruby/st.h).
>
>    Modern processors have several levels of cache.  Usually,the CPU
> reads one or a few lines of the cache from memory (or another level of
> cache).  So CPU is much faster at reading data stored close to each
> other.  The current implementation of Ruby hash tables does not fit
> well to modern processor cache organization, which requires better
> data locality for faster program speed.
>
> The new hash table implementation achieves a better data locality
> mainly by
>
>    o switching to open addressing hash tables for access by keys.
>      Removing hash collision lists lets us avoid *pointer chasing*, a
>      common problem that produces bad data locality.  I see a tendency
>      to move from chaining hash tables to open addressing hash tables
>      due to their better fit to modern CPU memory organizations.
>      CPython recently made such switch
>      (https://hg.python.org/cpython/file/ff1938d12240/Objects/dictobject.c).
>      PHP did this a bit earlier
>      https://nikic.github.io/2014/12/22/PHPs-new-hashtable-implementation.html.
>      GCC has widely-used such hash tables
>      (https://gcc.gnu.org/svn/gcc/trunk/libiberty/hashtab.c) internally
>      for more than 15 years.
>
>    o removing doubly linked lists and putting the elements into an array
>      for accessing to elements by their inclusion order.  That also
>      removes pointer chaising on the doubly linked lists used for
>      traversing elements by their inclusion order.
>
> A more detailed description of the proposed implementation can be
> found in the top comment of the file st.c.
>
> The new implementation was benchmarked on 21 MRI hash table benchmarks
> for two most widely used targets x86-64 (Intel 4.2GHz i7-4790K) and ARM
> (Exynos 5410 - 1.6GHz Cortex-A15):
>
> make benchmark-each ITEM=bm_hash OPTS='-r 3 -v' COMPARE_RUBY='<trunk ruby>'
>
> Here the results for x86-64:
>
> hash_aref_dsym       1.094
> hash_aref_dsym_long          1.383
> hash_aref_fix        1.048
> hash_aref_flo        1.860
> hash_aref_miss       1.107
> hash_aref_str        1.107
> hash_aref_sym        1.191
> hash_aref_sym_long           1.113
> hash_flatten         1.258
> hash_ident_flo       1.627
> hash_ident_num       1.045
> hash_ident_obj       1.143
> hash_ident_str       1.127
> hash_ident_sym       1.152
> hash_keys            2.714
> hash_shift           2.209
> hash_shift_u16       1.442
> hash_shift_u24       1.413
> hash_shift_u32       1.396
> hash_to_proc         2.831
> hash_values          2.701
>
> The average performance improvement is more 50%.  ARM results are
> analogous -- no any benchmark performance degradation and about the
> same average improvement.
>
> The patch can be seen as
>
> https://github.com/vnmakarov/ruby/compare/trunk...hash_tables_with_open_addressing.patch
>
> or in a less convenient way as pull request changes
>
> https://github.com/ruby/ruby/pull/1264/files
>
>
> This is my first patch for MRI and may be my proposal and
> implementation have pitfalls.  But I am keen to learn and work on
> inclusion of this code into MRI.
>
> ~~~
>
> ---Files--------------------------------
> 0001-st.c-change-st_table-implementation.patch (59.4 KB)
> st-march31.patch (114 KB)
> base.patch (93.8 KB)
> hash.patch (4.48 KB)
> strong_hash.patch (8.08 KB)
> city.patch (19.4 KB)
> new-hash-table-benchmarks.patch (1.34 KB)
> hash_improvements_and_st_implementation_changes.mbox (101 KB)
> hash_improvements_and_st_array_with_open_addressing.mbox (108 KB)
> hash_improvements_additional.mbox (5.09 KB)
>
>


Unsubscribe: <mailto:ruby-core-request / ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>