Issue #9425 has been updated by Eric Wong.


 normalperson / yhbt.net wrote:
 >    test and verify compare_by_identity performance
 >  
 >  I'm comfortable that ID, string and most objects will hash well with
 >  power-of-two; but compare_by_identity, weakmap, vm->living_threads may
 >  hash less well without a prime modulo (or maybe they hash badly
 >  regardless of modulo!)
 
 OK, I was right about compare_by_identity being worse with power-of-two,
 but I fixed it by tweaking numhash:
 
 http://bogomips.org/ruby.git/patch?id=1579e9d0d82789
 
 I was wrong about IDs hashing well before, they hash OK now :)
 
 results: http://80x24.org/bmlog-20140303-034047.26775.txt
 the hash parts:
 	hash_aref_miss	1.048
 	hash_aref_str	1.162
 	hash_aref_sym	1.000
 	hash_flatten	1.092
 	hash_ident_num	1.007
 	hash_ident_obj	1.098
 	hash_ident_str	1.106
 	hash_ident_sym	1.018
 	hash_keys	1.000
 	hash_shift	1.011
 	hash_values	1.011
 	vm2_bighash*	1.183
 
 These numbers are from my weaker AMD FX-8320 which gave me worse numbers
 than my Haswell machine in my original test.  I'll try to test on my
 Haswell machine soon (network outage there :<).
 
 I'd like to commit the following three patches soon:
 http://bogomips.org/ruby.git/patch?id=a3fde671ffeec8 new hash benchmarks
 http://bogomips.org/ruby.git/patch?id=8f155afef61342 original patch
 http://bogomips.org/ruby.git/patch?id=1579e9d0d82789 numhash tweak

----------------------------------------
Feature #9425: [PATCH] st: use power-of-two sizes to avoid slow modulo ops
https://bugs.ruby-lang.org/issues/9425#change-45586

* Author: Eric Wong
* Status: Open
* Priority: Low
* Assignee: 
* Category: core
* Target version: current: 2.2.0
----------------------------------------
Prime number-sized hash tables are only needed to compensate for bad
hash functions.  Ruby has good hash functions nowadays, so reduce our
code size with power-of-two-sized hash tables which allows us to avoid
the slow modulo operation.  I expected numhash performance to be worse,
but it seems most of those hashes are either too-small-to-matter or
well-distributed anyways.  If we find problems with some existing
numhashes we should start using a proper hash function (Murmur via
st_hash_uint) on those.

This consistently speeds up the bm_hash_flatten and bm_vm2_bighash.

target 0: trunk (ruby 2.2.0dev (2014-01-17 trunk 44631) [x86_64-linux]) at "/home/ew/rrrr/i/bin/ruby --disable=gems"
target 1: st-noprime (ruby 2.2.0dev (2014-01-17 trunk 44631) [x86_64-linux]) at "/home/ew/ruby/i/bin/ruby --disable=gems"

Benchmarks on a Xeon E3-1230 v3 CPU:

	minimum results in each 10 measurements.
	Execution time (sec)
	name	trunk	st-noprime
	hash_flatten	0.500	0.345
	hash_keys	0.191	0.192
	hash_shift	0.019	0.018
	hash_values	0.201	0.200
	loop_whileloop2	0.090	0.090
	vm2_bighash*	4.457	3.578

	Speedup ratio: compare with the result of `trunk' (greater is better)
	name	st-noprime
	hash_flatten	1.451
	hash_keys	0.998
	hash_shift	1.046
	hash_values	1.003
	loop_whileloop2	1.000
	vm2_bighash*	1.246

Somewhat less impressive on an AMD FX 8320:

	minimum results in each 10 measurements.
	Execution time (sec)
	name    trunk   st-noprime
	hash_flatten    0.633   0.596
	hash_keys       0.236   0.232
	hash_shift      0.031   0.032
	hash_values     0.234   0.238
	loop_whileloop2 0.135   0.135
	vm2_bighash*    8.198   6.982

	Speedup ratio: compare with the result of `trunk' (greater is better)
	name    st-noprime
	hash_flatten    1.063
	hash_keys       1.020
	hash_shift      0.976
	hash_values     0.982
	loop_whileloop2 1.000
	vm2_bighash*    1.174

----------------------------------------------------------------
The following changes since commit 2c3522c3e403dfdadaaf6095564bde364cc4bddf:

  test_thread.rb: stop at once (2014-01-16 06:34:47 +0000)

are available in the git repository at:

  git://80x24.org/ruby.git st-noprime

for you to fetch changes up to ed4f4103f4f407ed99dd6cd25b6c35d3aa9f3479:

  st: use power-of-two sizes to avoid slow modulo ops (2014-01-18 04:05:21 +0000)

----------------------------------------------------------------
Eric Wong (1):
      st: use power-of-two sizes to avoid slow modulo ops

 st.c | 100 +++++++++++++++++--------------------------------------------------
 1 file changed, 25 insertions(+), 75 deletions(-)


---Files--------------------------------
0001-st-use-power-of-two-sizes-to-avoid-slow-modulo-ops.patch (10.9 KB)


-- 
http://bugs.ruby-lang.org/