Issue #16237 has been updated by mame (Yusuke Endoh).

Status changed from Open to Closed

This is a spec.  The reason is because `Set#union` duplicates self and then merges the argument, which makes your "set" implementation O( n^2 ).  You may want to use `Set#merge` instead of `Set#union` in this case.

```
def set2
  ret = Set.new
  REPEAT.times do
    ret.merge(Set.new(random_array))
  end
  ret
end
```

```
       user     system      total        real
hash  0.404887   0.011980   0.416867 (  0.416911)
set2  0.468687   0.004019   0.472706 (  0.473242)
```

----------------------------------------
Bug #16237: Set#union performance issue
https://bugs.ruby-lang.org/issues/16237#change-81907

* Author: lionelperrin (Lionel Perrin)
* Status: Closed
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: ruby 2.6.2p47 (2019-03-13 revision 67232) [x86_64-linux]
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN
----------------------------------------
I've discovered a very large difference in performances between Set#union and the use of a hash to compute this union.
The issue can be shown with the code below.

Here are the benchmark results:

```
       user     system      total        real
hash  0.562500   0.078125   0.640625 (  0.639528)
uniq 54.375000   0.312500  54.687500 ( 54.772007)
set 37.312500   0.125000  37.437500 ( 37.474751)
mix1 43.718750   0.000000  43.718750 ( 43.890027)
mix2 37.109375   0.000000  37.109375 ( 37.190839)
```

The code example:

```
require 'benchmark'
require 'set'

REPEAT = 1_000

$DISTINCT_COUNT = 100_000_000

def random_array(size=1_000)
  Array.new(size) { Random.rand(1..$DISTINCT_COUNT) }
end

def uniq
  ret = []
  REPEAT.times do
    ret += random_array
    ret.uniq!
  end
  ret
end

def set
  ret = Set.new
  REPEAT.times do
    ret += Set.new(random_array)
  end
  ret
end

def mix1
  ret = Set.new
  REPEAT.times do
    ret += random_array
  end
  ret
end

def mix2
  ret = Set.new
  REPEAT.times do
    ret += random_array.uniq
  end
  ret
end

def hash
  ret = {}
  REPEAT.times do
    random_array.each { |v| ret[v] = nil }
  end
  ret.keys
end

Benchmark.bm do |x|
  %i(hash uniq set mix1 mix2).each do |i|
    x.report(i, &method(i))
  end
end

```





-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request / ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>