"Daniel Sheppard" <daniels / pronto.com.au> writes: > I was doing some playing with some implementations for Hash#== and ended > up with some very counterintuitive timings. > > I have four equals methods, and I would have expected their performance > to run (from best to worst): > > directInject - stops evaluating == after first bad match, no > intermediate arrays > keysInject - as above, but must create keys array > directCollect - evaluated == for all items all the time, intermediate > array of results > keysCollect - as above, but must create keys array. Your *Inject methods don't actually quit early; they just stop evaluating the second expression once val is false. An explicit break speeds up the quick-exit case considerably. > def directInject(hash1, hash2) > hash1.size == hash2.size && > hash1.inject(true) { |val, kv| val && hash2[kv[0]] == kv[1] } hash1.inject(true) { |val, kv| val or break; hash2[kv[0]] == kv[1] } > end > def keysInject(hash1, hash2) > hash1.size == hash2.size && > hash1.keys.inject(true) { |val, k| val && hash1[k] == hash2[k] } hash1.keys.inject(true) { |val, k| val or break; hash1[k] == hash2[k] } > end Unless you're using a really old ruby, though, Enumerable#all? should outperform all of the above. ---- revised code ---- def keysCollect(hash1, hash2) hash1.size == hash2.size && !(hash1.keys.collect{|k| !hash2[k].nil? && hash1[k] == hash2[k] }.include?(false)) end def directCollect(hash1, hash2) hash1.size == hash2.size && !(hash1.collect{|k,v| hash2[k] == v }.include?(false)) end def directInject(hash1, hash2) hash1.size == hash2.size && hash1.inject(true) { |val, kv| val or break; hash2[kv[0]] == kv[1] } end def keysInject(hash1, hash2) hash1.size == hash2.size && hash1.keys.inject(true) { |val, k| val or break; hash1[k] == hash2[k] } end def using_all hash1, hash2 hash1.size == hash2.size && hash1.all?{|k, v| hash2[k] == v} end def time(p, *args) time = Time.now 1000.times { p.call *args} Time.now - time end hash1 = Hash.new hash2 = Hash.new xx = 'a' 2000.times { hash1[xx] = xx; xx = xx.next } 2000.times { hash2[xx] = xx; xx = xx.next } [:keysCollect, :directCollect, :directInject, :keysInject, :using_all].each { |s| puts puts s puts "equal #{time(method(s), hash1, hash1)}" puts "not equal #{time(method(s), hash1, hash2)}" } ---- output ---- keysCollect equal 3.071011 not equal 1.992086 directCollect equal 3.211872 not equal 3.447227 directInject equal 4.202059 not equal 0.004962 keysInject equal 3.343774 not equal 0.147189 using_all equal 2.84662 not equal 0.003133 ---- Incidentally, if you're using this to compare equality-of-Hash-sans-default-proc, none of these are adequate in the general case since keys may exist that map to the default value. Perhaps: def using_all hash1, hash2 hash1.size == hash2.size && hash1.all?{|k, v| hash2.has_key?(k) && hash2[k] == v} end Or, as mentioned elsewhere, the shorter: {}.update(h1) == {}.update(h2) .... which seems to have better worst-case, but worse best-case performance. HTH