On 7/31/06, Francis Cianfrocca <garbagecat10 / gmail.com> wrote:
> Timothy Goddard wrote:
> > Notice that using MD5 is significantly slower than normal string
> > comparison. This also demonstrates that there are few performance gains
> > between 10KB buffers and 100KB buffers, indicating that somewhere in
> > the 10K range would be a good buffer size for the memory/performance
> > tradeoff.
> >
>
> I notice that MD5-generation is not twice as time-consuming as string
> comparison. In fact, it's only a little more time-consuming, which was
> an interesting surprise until I checked the source code and realized
> that Ruby uses the C reference implementation to compute MD5.
>
> Comparing strings is obviously the better choice for doing one-off
> comparisons that won't be repeated. But for applications like
> cache-management or public email systems, where you're going to be
> comparing many times against the same chunk of bits, it makes more sense
> to store an MD5. That way, subsequent trials only have to compute one
> hash, not two.
>
> Someone upthread suggested using SHA1 instead of MD5 for this purpose. I
> haven't done the comparison in Ruby, but in C implementations, SHA1 is
> just slightly slower than MD5, not enough to matter. And Ruby's SHA1
> implementation is also in C.

The choice between CRC32, MD5 and SHA1 is a trade-off: time and space
on one side, false-positive probability on the other.

For normal uses, CRC32 (32 bits) plus the size should be enough. It has
the nice feature that the checksum fits into a doubleword.
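
A minimal sketch of the CRC-plus-size idea, assuming Ruby's standard
zlib binding (Zlib.crc32). The helper name crc_fingerprint is mine, not
something from the thread:

```ruby
require 'zlib'

# Hypothetical helper: a cheap fingerprint made of the CRC32 checksum
# and the byte length of the data.
def crc_fingerprint(data)
  [Zlib.crc32(data), data.bytesize]
end

a = "hello world"
b = "hello world"
c = "hello worle"   # same length, last byte differs

same   = crc_fingerprint(a) == crc_fingerprint(b)
differ = crc_fingerprint(a) == crc_fingerprint(c)

# The checksum alone packs into 4 bytes (one doubleword), big-endian here.
packed = [Zlib.crc32(a)].pack('N')

puts same             # true
puts differ           # false
puts packed.bytesize  # 4
```

Comparing the size first is cheap and rules out most mismatches before
the CRC is even looked at.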

The advantage of MD5 and SHA1 is that they are designed to be one-way
functions, and that collisions are meant to be hard to find.

So:
If you need that 30% speed gain or the 12 bytes saved per hash (4 bytes
for CRC32 versus 16 for MD5), you don't need attack-resistance, and a
false-match probability of about 2^-32 is low enough, then use CRC32.
If you do need attack-resistance, I would choose SHA1.
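
To make the space side of the trade-off concrete, here is a rough
sketch using Ruby's standard digest and zlib libraries. The helper
checksum_for and its attack_resistant flag are names I made up for
illustration; treat this as one possible way to wire it up, not a
recommendation beyond what is said above:

```ruby
require 'digest/md5'
require 'digest/sha1'
require 'zlib'

# Hypothetical helper (not from the thread): returns a raw binary
# checksum. attack_resistant is an assumed flag name.
def checksum_for(data, attack_resistant)
  if attack_resistant
    Digest::SHA1.digest(data)       # 20 raw bytes
  else
    [Zlib.crc32(data)].pack('N')    # 4 raw bytes, big-endian
  end
end

chunk = "some cached chunk of bits"

# Storage cost per checksum, in bytes.
sizes = {
  'crc32' => checksum_for(chunk, false).bytesize,
  'md5'   => Digest::MD5.digest(chunk).bytesize,
  'sha1'  => checksum_for(chunk, true).bytesize
}
puts sizes.inspect

# Store the checksum once; each later comparison then computes only
# one new checksum instead of two.
stored    = checksum_for(chunk, true)
match     = checksum_for("some cached chunk of bits", true) == stored
mismatch  = checksum_for("a different chunk", true) == stored
puts match      # true
puts mismatch   # false
```

The speed side is not measured here; that would need a benchmark on
your own data and buffer sizes.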