On Dec 30, 2011, at 4:59 PM, Ryan Davis wrote:
> The other confusion for you is insisting it is eql? instead of ==.
> Yong Li nailed the description of how it works. Please read it again.
> It is as close to perfect as we're going to get.
>
> Also, as pointed out on the other hash thread... There _needs_ to be
> a 1:1 correlation between the result of #== and the result of #hash.
> You cannot simply use the "most relevant attribute". You _must_ use
> _all_ the attributes that you use against equality tests. Doing this
> is fundamental to ruby (and computer science) and must be thoroughly
> understood.

I think there are some errors and/or misleading statements in this
discussion.

First of all, the implementation of Hash depends on testing the
equality of two objects via #eql? and not via #==.  This is easy
to see by using 1 and 1.0 in a hash:

>> 1.hash #=> 3943323080027384908
>> (1.0).hash #=> -6757032739833615
>> 1 == 1.0 #=> true
>> 1.eql?(1.0) #=> false
>> h = {} #=> {}
>> h[1] = 'a' #=> "a"
>> h[1.0] = 'b' #=> "b"
>> h #=> {1=>"a", 1.0=>"b"}

If #== were being used by Hash, the hash at the end of that sequence
would contain only one entry, because 1 == 1.0 is true.
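
The same behavior can be shown with user-defined classes. This is a
minimal sketch; the class names Loose and Strict are made up for
illustration. Loose overrides only #==, while Strict also overrides
#eql? and #hash, and only Strict objects are treated as the same key:

```ruby
# Hypothetical classes, for illustration only.
class Loose
  attr_reader :v
  def initialize(v); @v = v; end
  # Only == is overridden; eql? and hash are inherited from Object.
  def ==(other); other.is_a?(Loose) && v == other.v; end
end

class Strict
  attr_reader :v
  def initialize(v); @v = v; end
  def ==(other); other.is_a?(Strict) && v == other.v; end
  # Hash consults these two methods, not ==.
  def eql?(other); self == other; end
  def hash; v.hash; end
end

loose = { Loose.new(1) => 'a' }
loose[Loose.new(1)] = 'b'     # adds a second entry
puts loose.size               # 2

strict = { Strict.new(1) => 'a' }
strict[Strict.new(1)] = 'b'   # overwrites the existing entry
puts strict.size              # 1
```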

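I don't think it is correct to call the relationship between #hash
and equality one-to-one. The contract runs in one direction only,
and it is stated in terms of eql?, which is what Hash actually uses:

(a.eql?(b)) implies (a.hash == b.hash)

but the reverse is not true.

(a.hash == b.hash) does not imply (a.eql?(b))

(With #==, even the first direction fails for the built-in numerics:
1 == 1.0 is true, yet 1.hash and (1.0).hash differ.)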

If two objects have the same hash, they may or may not be equal.
If they aren't equal, you just have a hash collision that has to
be disambiguated by doing a full equality test via eql?.
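
As a small sketch of that disambiguation (the class Collider is
hypothetical), every instance below reports the same hash value, so
all keys collide, yet Hash still keeps non-equal keys apart by
calling eql?:

```ruby
# Hypothetical class: all instances share one hash value.
class Collider
  attr_reader :name
  def initialize(name); @name = name; end
  def hash; 1; end
  def eql?(other); other.is_a?(Collider) && name == other.name; end
  def ==(other); eql?(other); end
end

a = Collider.new('a')
b = Collider.new('b')

puts a.hash == b.hash          # true  (a collision)
puts a.eql?(b)                 # false (not the same key)

h = { a => 1, b => 2 }
puts h.size                    # 2
puts h[Collider.new('a')]      # 1
```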

Finally, there is no hard requirement that a hash implementation
'must use all the attributes' used for the equality test. If there
is a subset of attributes that are generally different for non-equal
objects then the hash function will be more performant if it only
uses the subset of attributes.
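
A sketch of what that can look like, using a made-up Employee class
and assuming that id is the attribute that usually differs between
non-equal objects. Equality compares every attribute, while #hash
looks only at id; the contract still holds because objects that are
eql? necessarily share the same id:

```ruby
# Hypothetical class, for illustration only.
class Employee
  attr_reader :id, :name, :dept
  def initialize(id, name, dept)
    @id, @name, @dept = id, name, dept
  end

  # Equality compares all attributes.
  def eql?(other)
    other.is_a?(Employee) &&
      [id, name, dept] == [other.id, other.name, other.dept]
  end
  def ==(other); eql?(other); end

  # The hash uses a subset of them.
  def hash; id.hash; end
end

e1 = Employee.new(1, 'Ann', 'Eng')
e2 = Employee.new(1, 'Ann', 'Eng')   # equal to e1
e3 = Employee.new(1, 'Bob', 'Ops')   # same id, not equal to e1

h = { e1 => :found }
puts h[e2].inspect   # :found
puts h[e3].inspect   # nil (same hash as e1, but eql? says no)
```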

The important point is that you don't want your hash function to
create too many collisions, where non-equal objects have the same
hash value.

For example:

def hash; 1; end

will 'work' but will cause performance problems when those objects
are stored in a Hash:

require 'benchmark'

class A; end
class B; def hash; 1; end; end

n = 10000

Benchmark.bm(20) do |x|
  x.report('Object#hash') { h = {}; n.times { |i| h[A.new] = i } }
  x.report('1')           { h = {}; n.times { |i| h[B.new] = i } }
end

user     system      total        real
Object#hash            0.010000   0.000000   0.010000 (  0.008124)
1                      2.300000   0.000000   2.300000 (  2.311228)

Gary Wright