On 10/17/05, Sam Kong <sam.s.kong / gmail.com> wrote:
>
> Ryan, thanks for the benchmarking.
> Please, add my original solution to your benchmark.
> It seems to be very fast.

Yes it is pretty fast, see below.

> The reason I asked the question was not performance but briefness and
> rubish style.
> The slow one I came up later looks very intuitive and brief but suffers
> from performance.
> Briefness and performance don't always go together.

Absolutely. Many "cute" solutions to problems like this are quite
slow. But keep in mind the benchmarking cases are pretty extreme, and
Ruby is quite fast in all cases except the particularly slow
algorithms.

Here are the latest results, with Park's solution and Sam's first
solution added to the mix:

Small array, many iterations:
                user     system      total        real
david       6.063000   0.015000   6.078000 (  6.094000)
jeff        1.719000   0.000000   1.719000 (  1.719000)
jegII       1.844000   0.000000   1.844000 (  1.860000)
markvh      1.437000   0.000000   1.437000 (  1.438000)
martin      1.750000   0.000000   1.750000 (  1.766000)
paolo       3.469000   0.016000   3.485000 (  3.499000)
park        2.141000   0.000000   2.141000 (  2.157000)
paul        4.016000   0.000000   4.016000 (  4.188000)
robert      3.500000   0.015000   3.515000 (  3.766000)
samk1       2.094000   0.016000   2.110000 (  2.266000)
samk2       5.391000   0.000000   5.391000 (  5.546000)
simonk      1.609000   0.000000   1.609000 (  1.610000)
simons      2.547000   0.000000   2.547000 (  2.578000)
zach        0.344000   0.000000   0.344000 (  0.344000)

Big array, one iteration:
                user     system      total        real
david      35.093000   0.000000  35.093000 ( 35.906000)
jeff        0.110000   0.000000   0.110000 (  0.109000)
jegII       0.000000   0.000000   0.000000 (  0.000000)
markvh      2.172000   0.000000   2.172000 (  2.172000)
martin      0.016000   0.000000   0.016000 (  0.016000)
paolo       0.015000   0.000000   0.015000 (  0.015000)
park        0.015000   0.000000   0.015000 (  0.015000)
paul        0.063000   0.000000   0.063000 (  0.063000)
robert      0.047000   0.000000   0.047000 (  0.047000)
samk1       0.031000   0.000000   0.031000 (  0.031000)
samk2      17.750000   0.000000  17.750000 ( 17.812000)
simonk      4.203000   0.000000   4.203000 (  4.297000)
simons      2.344000   0.000000   2.344000 (  2.500000)
zach        2.156000   0.000000   2.156000 (  2.297000)

Here is the benchmark code, which I've tweaked a bit so that hopefully
each test won't affect another (again, beware of line-wrapping):

require 'benchmark'

def run_benchmark(bx, name, iterations, &block)
  GC.start
  bx.report(name) { iterations.times &block }
end

[[[0,1,2,3,4,5,2,3,2,3], 50000],[(1..5000).to_a + (2000..2500).to_a,
1]].each do |a, iterations|
  Benchmark.bm(10) do |bx|
    run_benchmark(bx, 'david', iterations) { a.uniq.find_all {|x|
a.find_all {|y| y == x }.size > 1 } }
    run_benchmark(bx, 'jeff', iterations) { a.uniq.inject(a.dup) {
|b,i| b.delete_at(b.index(i)) ; b } }
    run_benchmark(bx, 'jegII', iterations) { seen =
Hash.new(0);a.select { |e| (seen[e] += 1) > 1 }.uniq }
    run_benchmark(bx, 'markvh', iterations) { t=[]; a.select{ |e|
r=t.include?e; t<<e; r }.uniq }
    run_benchmark(bx, 'martin', iterations) { h1 = {};h2 = {};a.each
{|i| h2[i] = true if h1[i];h1[i] = true };h2.keys }
    run_benchmark(bx, 'paolo', iterations) { h = {};u = a.inject([])
{|res, x| h[x] ? res + [x] : h[x] = res}.uniq }
    run_benchmark(bx, 'park', iterations) { t = Hash.new(0);
a.each{|e|t[e]+=1};t.delete_if{|k,v|v==1}.keys }
    run_benchmark(bx, 'paul', iterations) { a.sort.join('
').to_s.scan(/(\d+)\s(\1)+/).flatten.uniq }
    run_benchmark(bx, 'robert', iterations) { a.inject(Hash.new(0))
{|h,e| h[e]+=1;h}.select {|k,v|v>1}.map {|x,y|x} }
    run_benchmark(bx, 'samk1', iterations) { h = {};a.each do |i| if
h[i] then h[i] += 1 else h[i] = 1 end;end;u = []; h.each do |k, v| if
v > 1 then u << k end; end; u }
    run_benchmark(bx, 'samk2', iterations) { a.select{|e|
a.grep(e).length > 1}.uniq }
    run_benchmark(bx, 'simonk', iterations) { a.select{|e| a.index(e)
!= a.rindex(e)}.uniq }
    run_benchmark(bx, 'simons', iterations) { uary = a.uniq;
a.map.delete_if{|i| x=uary.member?(i); uary.delete(i); x} }
    run_benchmark(bx, 'zach', iterations) { t=[]; a.delete_if{ |e|
r=(not t.include? e); t.push(e); r }.uniq }
  end
end

Regards,
Ryan