On Wed, 08 Dec 2004 03:15:10 +0900, Florian Frank wrote:

> Jonathan Paisley wrote:
> 
>>I did some benchmark tests of all four implementations (my original, your
>>two, and the above). The last is about twice as fast as each_with_index
>>and inject, and at least an order of magnitude faster than my original
>>method involving flattening.
>>  
> Try this implementation, too:
> 
> hash = {}
> keys.size.times { |i| hash[ keys[i] ] = values[i] }

Below are benchmark results for four different implementations
(excluding the *zip.flatten approach, since it's way too slow). The
key and value arrays each hold the integers 0...1000000.

The second set of results was taken with GC disabled during each run;
the rows marked (GC) give the time of the forced collection after that
run. The difference is striking: your size.times approach takes under
3.5 seconds in total either way, but the other three take 27 to 28
seconds with GC enabled against roughly 4.6 to 5.7 seconds (run plus
collection) with it disabled. I assume this means that the other three
are causing a lot of temporary objects to be created.

** I suppose these objects could be the two-element arrays that hold the
pair of arguments for each block invocation. Can anybody confirm this or
suggest otherwise?
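
One way to test that guess is to count allocations directly instead of
inferring them from timings. The following is only a rough sketch: it
assumes Ruby 2.2 or later (for GC.stat(:total_allocated_objects)), the
helper name "allocations" is made up for illustration, and the counts on
a newer interpreter need not match what the interpreter behind the
timings in this post would show.

```ruby
# Sketch: count objects allocated while building the hash each way.
# Assumes Ruby 2.2+ for GC.stat(:total_allocated_objects).
# `allocations` is a hypothetical helper, not part of any library.
def allocations
  GC.disable                                   # keep the collector out of the way
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before   # objects created by the block
ensure
  GC.enable
end

n = 100_000
keys = (0...n).to_a
values = keys.dup

counts = {
  "each_with_index" => allocations { h = {}; keys.each_with_index { |e, i| h[e] = values[i] } },
  "zip with block"  => allocations { h = {}; keys.zip(values) { |k, v| h[k] = v } },
  "size.times"      => allocations { h = {}; keys.size.times { |i| h[keys[i]] = values[i] } },
}

counts.each { |label, count| puts "#{label.ljust(16)} #{count}" }
```

If the pair arrays really are the culprit, I'd expect the variants that
yield two values per iteration to show a count that grows with n, while
size.times stays close to flat.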

I realise now that my original tests (where I said that the 
zip-with-block solution was twice as fast as each_with_index and 
inject) were flawed since I wasn't using a large enough data set.

GC enabled (default run):

                          user     system      total        real
each_with_index      28.090000   0.110000  28.200000 ( 28.794794)
inject               27.180000   0.040000  27.220000 ( 27.742072)
zip with block       27.610000   0.030000  27.640000 ( 28.192968)
size.times            3.270000   0.060000   3.330000 (  3.381180)


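GC disabled during each run (script started with any argument); the
(GC) rows time the forced collection afterwards:

                          user     system      total        real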
each_with_index       4.040000   0.260000   4.300000 (  4.421269)
each_with_index(GC)   0.720000   0.000000   0.720000 (  0.767091)
inject                4.470000   0.090000   4.560000 (  4.760866)
inject(GC)            1.150000   0.010000   1.160000 (  1.157983)
zip with block        3.500000   0.000000   3.500000 (  3.630590)
zip with block(GC)    1.130000   0.000000   1.130000 (  1.136200)
size.times            2.920000   0.010000   2.930000 (  3.017372)
size.times(GC)        0.470000   0.000000   0.470000 (  0.478058)


=================================================================
require 'benchmark'

n = 1000000
keys = (0...n).to_a
values = keys.dup

def Hash.from_pairs_a(keys,values)
  h = {}
  keys.each_with_index {|e,i| h[e] = values[i]}
  h
end

def Hash.from_pairs_b(keys,values)
  Hash[*keys.zip(values).flatten]
end

def Hash.from_pairs_c(keys,values)
  h = {}
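  # Array#shift mutates its receiver, so consume a copy and leave the
  # caller's values intact for the benchmarks that run afterwards.
  keys.inject(values.dup) do |v,k|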
    h[k] = v.shift
    v
  end
  h
end

def Hash.from_pairs_d(keys,values)
  h = {}
  keys.zip(values) do |k,v|
    h[k] = v
  end
  h
end

def Hash.from_pairs_e(keys,values)
  hash = {}
  keys.size.times { |i| hash[ keys[i] ] = values[i] }
  hash
end

# Pass any command-line argument to run each benchmark with GC disabled.
$no_gc = ARGV[0]

Benchmark.bm(20) do |x|
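  # In no-GC mode: start each run from a freshly collected heap, keep GC
  # off while the block runs, then time the deferred collection separately
  # and report it as "label(GC)".
  def x.gcreport(label,&block)
    if $no_gc then GC.enable; GC.start; GC.disable; end
    report(label,&block)
    if $no_gc then
      report(label + "(GC)") { GC.enable; GC.start; GC.disable }
    end
  end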
  x.gcreport("each_with_index") { Hash.from_pairs_a(keys,values) }
  #x.report("*zip.flatten") { Hash.from_pairs_b(keys,values) }
  x.gcreport("inject") { Hash.from_pairs_c(keys,values) }
  x.gcreport("zip with block") { Hash.from_pairs_d(keys,values) }
  x.gcreport("size.times") { Hash.from_pairs_e(keys,values) }
end
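
Before trusting any of these timings, it's probably worth checking that
the variants build the same hash and leave their inputs untouched. The
following is a small self-contained sketch: the lambdas mirror the
methods above, "builders" is just a name I picked for illustration, and
the inject variant here works on a copy of values because Array#shift
is destructive.

```ruby
# Sketch: verify each variant builds the same hash and leaves its inputs alone.
# `builders` is a hypothetical name; the lambdas mirror the methods above.
builders = {
  "each_with_index" => lambda { |keys, values|
    h = {}
    keys.each_with_index { |e, i| h[e] = values[i] }
    h
  },
  "inject" => lambda { |keys, values|
    h = {}
    # Assumption of this sketch: consume a copy, since shift empties its receiver.
    keys.inject(values.dup) { |v, k| h[k] = v.shift; v }
    h
  },
  "zip with block" => lambda { |keys, values|
    h = {}
    keys.zip(values) { |k, v| h[k] = v }
    h
  },
  "size.times" => lambda { |keys, values|
    h = {}
    keys.size.times { |i| h[keys[i]] = values[i] }
    h
  },
}

keys = %w[a b c]
values = [1, 2, 3]
expected = { "a" => 1, "b" => 2, "c" => 3 }

results = builders.map do |label, build|
  # A variant passes only if the hash is right AND values is unchanged.
  ok = build.call(keys, values) == expected && values == [1, 2, 3]
  puts "#{label.ljust(16)} #{ok ? 'ok' : 'MISMATCH'}"
  [label, ok]
end
```

A variant that shifts elements off the shared values array would leave
it empty for every benchmark that runs after it, so the order of the
reports would then affect what is being measured.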