On Tue, 30 Jan 2007 07:38:20 +0900, Shot (Piotr Szotkowski) wrote:

> I have a Set subclass, Block. I need two blocks to be considered the
> same by Set iff they have the same elements. From what I gathered, Set
> treats two objects as equal when they are eql? and have the same hash.
> 
> Implementing Block#eql? seems to be trivial, as Set#== is there to help:
> 
> class Block < Set
>   def eql? other
>     self == other
>   end
> end
> 
> I°«m stuck when it comes to Block#hash, though; I need these to be true:
> Block.new.hash == Block.new.hash
> Block.new([1,2]).hash == Block.new([1,2]).hash
> 
> What°«s the proper way to go at it? Adding the elements°« hashes together
> is obviously wrong, as four different elements might produce the same
> sum when paired; also, the docs say, °∆any hash value that exceeds the
> capacity of a Fixnum will be truncated before being used,°« so hash can°«t
> exceed the (word - 1 bit) size.
> 
> -- Shot

There's nothing about hash codes that requires that
  a.hash != b.hash if a != b
You want to avoid collisions as much as possible, but
that's an issue of performance, not correctness, so you'd be perfectly
correct in adding up the hash codes of the members to create the has code.

Nor is there anything that requires you to keep your hash code within the
capacity of a Fixnum when you generate it. That's why it's automatically
truncated for you. The only reason you need to know is so that your hash
function doesn't hash things in such a way that the truncated number has
lots of collisions. (Once again a performance issue).

You may want to look at http://www.partow.net/programming/hashfunctions/
which discusses some relatively important hash functions. You may have to
experiment to see what works best. Wikipedia may also have good
information.

--Ken

-- 
Ken Bloom. PhD candidate. Linguistic Cognition Laboratory.
Department of Computer Science. Illinois Institute of Technology.
http://www.iit.edu/~kbloom1/