Hugh Sasse wrote:
> On Fri, 16 Sep 2005, Robert Klemme wrote:
>
>> Hugh Sasse wrote:
>>> While my brain is behaving like cottage cheese, it's probably not
>>> the time to ask how one might guarantee that you don't stomp on the
>>> hashes of other ojects in the system.  If you have an even number of
>>> elements, all the same Fixnum, like [1,1,1,1] then they would hash
>>> to 0, as would [2,2], I "think".
>>> irb(main):004:0> [1,1].inject(0) { |a,b| a ^= b.hash}
>>> => 0
>>> irb(main):005:0> [2,1,1,2].inject(0) { |a,b| a ^= b.hash}
>>> => 0
>>
>> Btw, the assignment is superfluous.  The result of a^b.hash is the
>> next iteration's a.
>
> Yes, good point, the result of the block....

Yep.

> ------------------------------------------------------
>       Enumerable#inject enum.inject(initial) {| memo, obj | block }
>       => obj enum.inject          {| memo, obj | block }  => obj
> ------------------------------------------------------------------------
>       Combines the elements of _enum_ by applying the block to an
>       accumulator value (_memo_) and each element in turn. At each
>       step, _memo_ is set to the value returned by the block. The
>       first form ================================================
>          [...]
>>
>>> irb(main):006:0>
>>
>> Yes.  The algorithm can certainly be improved on.  Typically you
>> rather do something similar to
>>
>> (a.hash ^ (b.hash << 3) ^ (c.hash << 7)) & MAX_HASH
>>
>> 09:53:59 [~]: irbs
>>>> [2,1,1,2].inject(0) { |a,b| ((a << 3) ^ b.hash) & 0xFFFF_FFFF}
>> => 2781
> lo>> [1,2, 1,2].inject(0) { |a,b| ((a << 3) ^ b.hash) & 0xFFFF_FFFF}
>> => 1885
>
> Ah, so that's what MAX_HASH is, I couldn't remember how big Fixnums
> were.

No, that's not really the limit - I just choose 32 bit as it's an often
used size. :-)

13:41:59 [~]: ruby -e 'i=1;i<<=1 while Fixnum === i;i>>=1;puts i.to_s(16),
i.class'
20000000
Fixnum

> I was thinking about something like a linear congruential
> random number generator like:
> brains hgs 22 %> irb
> irb(main):001:0> [2,1,1,2].inject(0) {|a,b| ((a * 31)+b.hash) %
> 4093082899} => 151936
> irb(main):002:0> [2,2,1,1].inject(0) {|a,b| ((a * 31)+b.hash) %
> 4093082899} => 153856
> irb(main):003:0> [2,1,2,1].inject(0) {|a,b| ((a * 31)+b.hash) %
> 4093082899} => 151996
> irb(main):004:0>
>
> like
> http://www.cs.bell-labs.com/cm/cs/pearls/markovhash.c
>
> from "Programming Pearls".

That code does only one modulus on the NHASH.  Your code does it for each
iteration.  But otherwise 31 seems a good number - Java containers use 31,
too. :-)

From AbstractList:

    public int hashCode() {
        int hashCode = 1;
        Iterator i = iterator();
        while (i.hasNext()) {
            Object obj = i.next();
            hashCode = 31*hashCode + (obj==null ? 0 : obj.hashCode());
        }
        return hashCode;
    }


> with a largish prime grabbed from
> http://primes.utm.edu/lists/small/small.html#10
>
> being the biggest I could see less than 0XFFFF_FFFF (4294967295)
>
>
> ------------------------------------------------ Class: Fixnum <
>       Integer A +Fixnum+ holds +Integer+ values that can be
>       represented in a native machine word (minus 1 bit). If any
> operation on a +Fixnum+
>
> Should that be 0x7FFF_FFFF?  (2147483647)
> According to
> http://www.rsok.com/~jrm/printprimes.html
> this would seem to be a prime number, so could be used as the
> modulus anyway.

You can even use bit and on this which I would expect to be slightly more
efficient.

Kind regards

    robert