Issue #16119 has been updated by dylants (Dylan Thacker-Smith).


duerst (Martin D=FCrst) wrote:
> However, I expect this to be slower on arrays that are 'almost flat', i.e.
> ```
> almost_flat_ary =3D 100.times.to_a + [1, 2]
> ```
> Putting the nested array at the end will make sure all elements of the bi=
g array are checked, only to discover that actual flattening work is needed=
. The time needed for checking will not be offset by the allocation savings=
 on the small array at the end.

The optimization handles this case by doing a quick `ary_memcpy` of everyth=
ing up to the first nested array to avoid redundantly rechecking if those p=
receding elements were an array.

I benchmarked it by replacing `arrays` and `Benchmark.bmbm` in my above bec=
hmark script with =


```ruby
ary =3D 100.times.to_a.push([101, 102])

Benchmark.bmbm do |x|
  report(x, "flatten!") do
    ary.flatten!
  end
  report(x, "flatten") do
    ary.flatten
  end
end
```

and got the following results on master

```
               user     system      total        real
flatten!   0.577976   0.002275   0.580251 (  0.582700)
flatten    0.542454   0.001994   0.544448 (  0.546523)
```

and the following results with this patch

```
               user     system      total        real
flatten!   0.420922   0.001052   0.421974 (  0.422957)
flatten    0.430728   0.001173   0.431901 (  0.433274)
```

so I actually noticed improved performance for arrays with a large flat pre=
fix.

----------------------------------------
Feature #16119: Optimize Array#flatten and flatten! for already flattened a=
rrays
https://bugs.ruby-lang.org/issues/16119#change-80938

* Author: dylants (Dylan Thacker-Smith)
* Status: Open
* Priority: Normal
* Assignee: =

* Target version: =

----------------------------------------
## Problem

When doing an object profile from stackprof, I noticed object allocations b=
eing made from `Array#flatten!` which was unlike other in-place methods lik=
e `Array#uniq!` and `Array#compact!`.  In this case, I wanted to optimize f=
or the array already being flattened and found that `ary.flatten! if ary.an=
y?(Array)` was significantly faster.  The code confirmed my suspicion that =
`ary.flatten!` was almost equivalent to `ary.replace(ary.flatten)` in imple=
mentation. The object allocations I noticed were from a temporary result ar=
ray, a stack array to handle nesting and a hash table to prevent infinite r=
ecursion, all of which can be avoided in the simple case of an already flat=
tened array.

## Solution

Iterate over the array to find the first nested array.  If no nested array =
is found, then `flatten!` just returns `nil` without creating additional ob=
jects and `flatten` returns a shared copy of the original array.  If a nest=
ed array is found, then it creates and initializes the temporary objects to=
 resume with the existing algorithm for flattening the array.

## Benchmark

```ruby
require 'benchmark'

N =3D 100000

def report(x, name)
  x.report(name) do
    N.times do
      yield
    end
  end
end

arrays =3D {
  small_flat_ary: 5.times.to_a,
  large_flat_ary: 100.times.to_a,
  small_pairs_ary: [[1, 2]] * 5,
  large_pairs_ary: [[1, 2]] * 100,
}

Benchmark.bmbm do |x|
  arrays.each do |name, ary|
    report(x, "#{name}.flatten!") do
      ary.flatten!
    end
    report(x, "#{name}.flatten") do
      ary.flatten
    end
  end
end
```

results on the latest master (`ruby 2.7.0dev (2019-08-22T14:10:55Z master f=
d20b32130) [x86_64-darwin18]`)

```
                               user     system      total        real
small_flat_ary.flatten!    0.082001   0.000294   0.082295 (  0.082685)
small_flat_ary.flatten     0.078655   0.000211   0.078866 (  0.079176)
large_flat_ary.flatten!    0.552551   0.001643   0.554194 (  0.556166)
large_flat_ary.flatten     0.551520   0.001327   0.552847 (  0.554091)
small_pairs_ary.flatten!   0.100073   0.000302   0.100375 (  0.100687)
small_pairs_ary.flatten    0.109440   0.000232   0.109672 (  0.109850)
large_pairs_ary.flatten!   1.021200   0.001650   1.022850 (  1.024227)
large_pairs_ary.flatten    1.049046   0.002938   1.051984 (  1.055228)
```

results with the attached patch

```
                               user     system      total        real
small_flat_ary.flatten!    0.034868   0.000150   0.035018 (  0.035180)
small_flat_ary.flatten     0.040504   0.000148   0.040652 (  0.040914)
large_flat_ary.flatten!    0.458273   0.000786   0.459059 (  0.460005)
large_flat_ary.flatten     0.453846   0.000833   0.454679 (  0.455324)
small_pairs_ary.flatten!   0.055211   0.000205   0.055416 (  0.055673)
small_pairs_ary.flatten    0.060157   0.000226   0.060383 (  0.060540)
large_pairs_ary.flatten!   0.901698   0.001650   0.903348 (  0.905246)
large_pairs_ary.flatten    0.917180   0.001370   0.918550 (  0.920008)
```

---Files--------------------------------
0001-Optimize-Array-flatten-and-flatten-for-already-fl.diff.txt (2.79 KB)


-- =

https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request / ruby-lang.org?subject=3Dunsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>