On Mon, Mar 03, 2003 at 06:13:38AM +0900, Mathieu Bouchard wrote:
> 
> On Sat, 1 Mar 2003, Mauricio Fern?ndez wrote:
> 
> > I am using the Jenkins' implementation from Perl 5.8. Moreover, I got
> > rid of the modulus operations (and the prime numbers) by using bitwise
> > AND and ensuring that num_bins is 2^n - 1 (this implies that I've had
> > to add 1 in a number of places, but I believe it is more important to
> > make FIND_ENTRY as fast as possible).
> 
> The point of modulo-big-prime is that you minimize the chance that a bad
> hashfunction maps to a limited number of slots. With 2**n slots you
> _almost_ _maximize_ it!

I was assuming we would use a good hash function...
 
> 
> >                            10000     30000    100000    300000  500000   1000000
> >      ruby-1.6.8.jenkins    0.0910    0.3198    1.2080   10.0227 7.7709   18.0764
> > ruby-1.6.8.jenkins.orig    0.0907    0.3254    1.2281   10.2327 8.0457   18.3917
> >            ruby.jenkins    0.0857    0.2942    1.3588    4.2214 8.0908   17.9354
> >       ruby.jenkins.orig    0.0902    0.3133    1.4469    4.5248 8.5991   18.7564
> > [BTW it seems 1.8's GC has been improved quite a bit as shown by the
> > great difference for 300000 iterations; great work! :) ]
> 
> Because in the 1.6.8 case the result of 300000 is (significantly!) bigger
> than in the 500000 case, I consider the whole test to be completely bogus.
> You'd have to find why it fails to produce results that make sense.

It's because of the GC.
I got similar speed increases when disabling it.
 
> And THEN you'd have to make the comparison on Symbols, Fixnums, plain
> Objects, Arrays of whatever, etc. Pick at least three different common
> cases to make sure you aren't deoptimising too many things.

We can probably find one way to hash these so that collisions don't grow
too much. I'll check it when I get some time.

-- 
 _           _                             
| |__   __ _| |_ ___ _ __ ___   __ _ _ __  
| '_ \ / _` | __/ __| '_ ` _ \ / _` | '_ \ 
| |_) | (_| | |_\__ \ | | | | | (_| | | | |
|_.__/ \__,_|\__|___/_| |_| |_|\__,_|_| |_|
	Running Debian GNU/Linux Sid (unstable)
batsman dot geo at yahoo dot com

* JHM wonders what Joey did to earn "I'd just like to say, for the record,
  that Joey rules."
	-- Seen on #Debian