Hugh Sasse wrote: > On Fri, 16 Sep 2005, Robert Klemme wrote: > >> Hugh Sasse wrote: >>> While my brain is behaving like cottage cheese, it's probably not >>> the time to ask how one might guarantee that you don't stomp on the >>> hashes of other ojects in the system. If you have an even number of >>> elements, all the same Fixnum, like [1,1,1,1] then they would hash >>> to 0, as would [2,2], I "think". >>> irb(main):004:0> [1,1].inject(0) { |a,b| a ^= b.hash} >>> => 0 >>> irb(main):005:0> [2,1,1,2].inject(0) { |a,b| a ^= b.hash} >>> => 0 >> >> Btw, the assignment is superfluous. The result of a^b.hash is the >> next iteration's a. > > Yes, good point, the result of the block.... Yep. > ------------------------------------------------------ > Enumerable#inject enum.inject(initial) {| memo, obj | block } > => obj enum.inject {| memo, obj | block } => obj > ------------------------------------------------------------------------ > Combines the elements of _enum_ by applying the block to an > accumulator value (_memo_) and each element in turn. At each > step, _memo_ is set to the value returned by the block. The > first form ================================================ > [...] >> >>> irb(main):006:0> >> >> Yes. The algorithm can certainly be improved on. Typically you >> rather do something similar to >> >> (a.hash ^ (b.hash << 3) ^ (c.hash << 7)) & MAX_HASH >> >> 09:53:59 [~]: irbs >>>> [2,1,1,2].inject(0) { |a,b| ((a << 3) ^ b.hash) & 0xFFFF_FFFF} >> => 2781 > lo>> [1,2, 1,2].inject(0) { |a,b| ((a << 3) ^ b.hash) & 0xFFFF_FFFF} >> => 1885 > > Ah, so that's what MAX_HASH is, I couldn't remember how big Fixnums > were. No, that's not really the limit - I just choose 32 bit as it's an often used size. :-) 13:41:59 [~]: ruby -e 'i=1;i<<=1 while Fixnum === i;i>>=1;puts i.to_s(16), i.class' 20000000 Fixnum > I was thinking about something like a linear congruential > random number generator like: > brains hgs 22 %> irb > irb(main):001:0> [2,1,1,2].inject(0) {|a,b| ((a * 31)+b.hash) % > 4093082899} => 151936 > irb(main):002:0> [2,2,1,1].inject(0) {|a,b| ((a * 31)+b.hash) % > 4093082899} => 153856 > irb(main):003:0> [2,1,2,1].inject(0) {|a,b| ((a * 31)+b.hash) % > 4093082899} => 151996 > irb(main):004:0> > > like > http://www.cs.bell-labs.com/cm/cs/pearls/markovhash.c > > from "Programming Pearls". That code does only one modulus on the NHASH. Your code does it for each iteration. But otherwise 31 seems a good number - Java containers use 31, too. :-) From AbstractList: public int hashCode() { int hashCode = 1; Iterator i = iterator(); while (i.hasNext()) { Object obj = i.next(); hashCode = 31*hashCode + (obj==null ? 0 : obj.hashCode()); } return hashCode; } > with a largish prime grabbed from > http://primes.utm.edu/lists/small/small.html#10 > > being the biggest I could see less than 0XFFFF_FFFF (4294967295) > > > ------------------------------------------------ Class: Fixnum < > Integer A +Fixnum+ holds +Integer+ values that can be > represented in a native machine word (minus 1 bit). If any > operation on a +Fixnum+ > > Should that be 0x7FFF_FFFF? (2147483647) > According to > http://www.rsok.com/~jrm/printprimes.html > this would seem to be a prime number, so could be used as the > modulus anyway. You can even use bit and on this which I would expect to be slightly more efficient. Kind regards robert