Issue #13660 has been reported by Eregon (Benoit Daloze).

----------------------------------------
Bug #13660: rb_str_hash_m discards bits from the hash
https://bugs.ruby-lang.org/issues/13660

* Author: Eregon (Benoit Daloze)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: ruby 2.3.3p222 (2016-11-21 revision 56859) [x64-mingw32]
* Backport: 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: UNKNOWN
----------------------------------------
I believe rb_str_hash_m might discard some bits from the hash value in some situations.

It computes the hash as a st_index_t, which is either a unsigned long or a unsigned long long.
But the st_index_t value is converted to a VALUE with:
#define ST2FIX(h) LONG2FIX((long)(h))

Note that for instance on x64-mingw32, SIZEOF_LONG is 4, but SIZEOF_LONG_LONG and SIZEOF_VOIDP are 8 bytes.
So that truncates half the bits of the hash on such a platform if my understanding is correct.

Even is SIZEOF_LONG is 8, LONG2FIX loses the MSB I think, given that not all long can fit the Fixnum range on MRI (should it be LONG2NUM?).
Also, I am not sure if it is intended to cast from a unsigned value to a signed value.

I tried many things while debugging the rb_str_hash spec on ruby/spec and eventually gave up.
This computation looks wrong to me in MRI.

For info, here is my debug code:
https://github.com/eregon/rubyspec/blob/d62189450c0a56bfcd379e5e505ad097892d2bc7/optional/capi/string_spec.rb#L501-L518
https://github.com/eregon/rubyspec/blob/d62189450c0a56bfcd379e5e505ad097892d2bc7/optional/capi/ext/string_spec.c#L361-L381
and the build result on AppVeyor:
https://ci.appveyor.com/project/eregon/spec-x948i/build/629



-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request / ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>