On Sat, Mar 01, 2003 at 08:42:40PM +0900, ts wrote:
> >>>>> "M" == Mauricio =?iso-8859-1?Q?Fern=E1ndez?= <Mauricio> writes:
> 
> M> --- ruby.jenkins.orig/gc.c	2003-02-28 17:16:13.000000000 +0100
> M> +++ ruby.jenkins/gc.c	2003-02-28 18:40:11.000000000 +0100
> [...]
> M> --- ruby.jenkins.orig/st.c	2003-02-28 17:16:14.000000000 +0100
> M> +++ ruby.jenkins/st.c	2003-02-28 19:18:15.000000000 +0100
> 
>  Apparently, you have not modified string.c (rb_str_hash()) it's normal ?

I only changed the hash in st.c (and propagated side effects in num_bins
to other files). I thought that strhash() was always being used (and was
wondering about what would happen to a string with '\0' inside...) but
now I see that I was wrong and in fact rb_any_hash is used which itself
calls to rb_str_hash.

The strange thing is that, even though the new hash function was not
being used (and BTW, it turned out it's not Jenkins but "one-at-a-time"),
I got some speed increase in simple benchmarks. I believe it's because
I got rid of the primes and modulus calculations and replaced it with
bitwise and... But then, why was that introduced in the first place if
it is slower? Perhaps it was needed for some of the "worse" hashing
functions, but why would we want to keep them now?

I'm changing string.c and benchmarking again. I will try the
"one-at-a-time" hash and Jenkins (it's possible here as we know the
length of the string) to see what's better. Stay tuned :)

-- 
 _           _                             
| |__   __ _| |_ ___ _ __ ___   __ _ _ __  
| '_ \ / _` | __/ __| '_ ` _ \ / _` | '_ \ 
| |_) | (_| | |_\__ \ | | | | | (_| | | | |
|_.__/ \__,_|\__|___/_| |_| |_|\__,_|_| |_|
	Running Debian GNU/Linux Sid (unstable)
batsman dot geo at yahoo dot com

Linux: the operating system with a CLUE... Command Line User Environment.
	-- seen in a posting in comp.software.testing