Issue #15797 has been updated by jeremyevans0 (Jeremy Evans).


I was able to run the benchmarks in a Linux virtual machine. This includes =
the benchmarks I added for `File.realdirpath` and for `File.realpath` where=
 the paths that do not exist.  `File.realdirpath` always uses the emulated =
code on Linux, so performance is basically the same for it.  For `File.real=
path`, this results in a 30%-146% performance improvement.

```
Warming up --------------------------------------
              relative_nil    68.629k i/s -     70.070k times in 1.020996s =
(14.57=CE=BCs/i)
              absolute_nil    57.387k i/s -     60.762k times in 1.058804s =
(17.43=CE=BCs/i)
         relative_relative    55.483k i/s -     58.799k times in 1.059775s =
(18.02=CE=BCs/i)
         absolute_relative    57.913k i/s -     61.438k times in 1.060868s =
(17.27=CE=BCs/i)
         relative_absolute    45.609k i/s -     45.756k times in 1.003219s =
(21.93=CE=BCs/i)
          relative_nil_dir    29.184k i/s -     30.960k times in 1.060844s =
(34.26=CE=BCs/i)
          absolute_nil_dir    33.260k i/s -     35.388k times in 1.063982s =
(30.07=CE=BCs/i)
     relative_relative_dir    28.179k i/s -     29.892k times in 1.060801s =
(35.49=CE=BCs/i)
     absolute_relative_dir    32.118k i/s -     33.948k times in 1.056964s =
(31.13=CE=BCs/i)
     relative_absolute_dir    31.790k i/s -     34.008k times in 1.069771s =
(31.46=CE=BCs/i)
     relative_nil_notexist    37.898k i/s -     39.276k times in 1.036371s =
(26.39=CE=BCs/i)
     absolute_nil_notexist    34.093k i/s -     35.460k times in 1.040093s =
(29.33=CE=BCs/i)
relative_relative_notexist    32.874k i/s -     34.464k times in 1.048382s =
(30.42=CE=BCs/i)
absolute_relative_notexist    33.897k i/s -     35.736k times in 1.054244s =
(29.50=CE=BCs/i)
relative_absolute_notexist    28.660k i/s -     30.480k times in 1.063515s =
(34.89=CE=BCs/i)
Calculating -------------------------------------
                             new/ruby    old/ruby
              relative_nil    69.788k     28.397k i/s -    205.887k times i=
n 2.950191s 7.250217s
              absolute_nil    58.143k     32.227k i/s -    172.162k times i=
n 2.961032s 5.342120s
         relative_relative    55.612k     27.348k i/s -    166.447k times i=
n 2.992992s 6.086198s
         absolute_relative    58.391k     30.164k i/s -    173.738k times i=
n 2.975413s 5.759725s
         relative_absolute    45.409k     30.837k i/s -    136.827k times i=
n 3.013203s 4.437175s
          relative_nil_dir    29.215k     29.319k i/s -     87.552k times i=
n 2.996856s 2.986143s
          absolute_nil_dir    32.813k     33.347k i/s -     99.779k times i=
n 3.040857s 2.992129s
     relative_relative_dir    28.121k     28.421k i/s -     84.536k times i=
n 3.006131s 2.974449s
     absolute_relative_dir    32.306k     32.606k i/s -     96.355k times i=
n 2.982614s 2.955131s
     relative_absolute_dir    31.942k     31.896k i/s -     95.369k times i=
n 2.985704s 2.989975s
     relative_nil_notexist    38.438k     20.326k i/s -    113.692k times i=
n 2.957835s 5.593418s
     absolute_nil_notexist    34.183k     22.248k i/s -    102.279k times i=
n 2.992088s 4.597170s
relative_relative_notexist    32.986k     19.736k i/s -     98.620k times i=
n 2.989738s 4.996882s
absolute_relative_notexist    33.944k     22.027k i/s -    101.691k times i=
n 2.995879s 4.616749s
relative_absolute_notexist    28.781k     21.903k i/s -     85.979k times i=
n 2.987334s 3.925385s

Comparison:
                           relative_nil
                  new/ruby:     69787.7 i/s
                  old/ruby:     28397.4 i/s - 2.46x  slower

                           absolute_nil
                  new/ruby:     58142.6 i/s
                  old/ruby:     32227.3 i/s - 1.80x  slower

                      relative_relative
                  new/ruby:     55612.2 i/s
                  old/ruby:     27348.3 i/s - 2.03x  slower

                      absolute_relative
                  new/ruby:     58391.2 i/s
                  old/ruby:     30164.3 i/s - 1.94x  slower

                      relative_absolute
                  new/ruby:     45409.2 i/s
                  old/ruby:     30836.5 i/s - 1.47x  slower

                       relative_nil_dir
                  old/ruby:     29319.4 i/s
                  new/ruby:     29214.6 i/s - 1.00x  slower

                       absolute_nil_dir
                  old/ruby:     33347.2 i/s
                  new/ruby:     32812.8 i/s - 1.02x  slower

                  relative_relative_dir
                  old/ruby:     28420.7 i/s
                  new/ruby:     28121.2 i/s - 1.01x  slower

                  absolute_relative_dir
                  old/ruby:     32606.0 i/s
                  new/ruby:     32305.6 i/s - 1.01x  slower

                  relative_absolute_dir
                  new/ruby:     31941.9 i/s
                  old/ruby:     31896.2 i/s - 1.00x  slower

                  relative_nil_notexist
                  new/ruby:     38437.6 i/s
                  old/ruby:     20326.0 i/s - 1.89x  slower

                  absolute_nil_notexist
                  new/ruby:     34183.2 i/s
                  old/ruby:     22248.3 i/s - 1.54x  slower

             relative_relative_notexist
                  new/ruby:     32986.2 i/s
                  old/ruby:     19736.3 i/s - 1.67x  slower

             absolute_relative_notexist
                  new/ruby:     33943.6 i/s
                  old/ruby:     22026.5 i/s - 1.54x  slower

             relative_absolute_notexist
                  new/ruby:     28781.2 i/s
                  old/ruby:     21903.3 i/s - 1.31x  slower
```

----------------------------------------
Feature #15797: Use realpath(3) instead of custom realpath implementation i=
f available
https://bugs.ruby-lang.org/issues/15797#change-78268

* Author: jeremyevans0 (Jeremy Evans)
* Status: Open
* Priority: Normal
* Assignee: =

* Target version: =

----------------------------------------
One reason to do this is simplicity, as this approach is ~30 lines of
code instead of ~200.

Performance wise, this performs 25%-115% better, using the following
benchmark on OpenBSD 6.5:

```ruby
require 'benchmark'

f =3D File
pwd =3D Dir.pwd
Dir.mkdir('b') unless f.directory?('b')
f.write('b/a', '') unless f.file?('b/a')

args =3D [
  ["b/a", nil],
  ["#{pwd}/b/a", nil],
  ['a', 'b'],
  ["#{pwd}/b/a", 'b'],
  ["b/a", pwd]
]

args.each do |path, base|
  print "File.realpath(#{path.inspect}, #{base.inspect}): ".ljust(50)
  puts Benchmark.measure{100000.times{f.realpath(path, base)}}
end
```

Before:

```
File.realpath("b/a", nil):                          4.330000   2.990000   7=
.320000 (  7.316244)
File.realpath("/home/testr/ruby/b/a", nil):         3.560000   2.680000   6=
.240000 (  6.240951)
File.realpath("a", "b"):                            4.370000   3.080000   7=
.450000 (  7.452511)
File.realpath("/home/testr/ruby/b/a", "b"):         3.730000   2.640000   6=
.370000 (  6.371979)
File.realpath("b/a", "/home/testr/ruby"):           3.590000   2.630000   6=
.220000 (  6.226824)
```

After:

```
File.realpath("b/a", nil):                          1.370000   2.030000   3=
.400000 (  3.400775)
File.realpath("/home/testr/ruby/b/a", nil):         1.260000   2.770000   4=
.030000 (  4.024957)
File.realpath("a", "b"):                            2.090000   1.990000   4=
.080000 (  4.080284)
File.realpath("/home/testr/ruby/b/a", "b"):         1.400000   2.620000   4=
.020000 (  4.015505)
File.realpath("b/a", "/home/testr/ruby"):           2.150000   2.760000   4=
.910000 (  4.910634)
```

If someone could benchmark before/after with this patch on Linux and/or Mac=
OS X,
and post the results here, I would appreciate it.

My personal reason for wanting this is that the custom realpath
implementation does not work with OpenBSD's unveil(2) system call,
which limits access to the file system, allowing for security
similar to chroot(2), without most of the downsides.

This change passes all tests except for one assertion related to
taintedness.  Previously, if either argument to `File.realpath` is an
absolute path, then the returned value is considered not tainted.
However, I believe that behavior to be incorrect, because if there is
a symlink anywhere in the path, the returned value can contain a
section that was taken from the file system (unreliable source) that
was not marked as untainted. Example:

```ruby
Dir.mkdir('b') unless File.directory?('b')
File.write('b/a', '') unless File.file?('b/a')
File.symlink('b', 'c') unless File.symlink?('c')
path =3D File.realpath('c/a'.untaint, Dir.pwd.untaint)
path # "/home/testr/ruby/b/a"
path.tainted? # should be true, as 'b' comes from file system
```

I believe it is safer to always mark the output of realpath as tainted
to prevent this issue, which is what this commit does.

---Files--------------------------------
use-native-realpath.patch (6.31 KB)
use-native-realpath-v2.patch (4.64 KB)
use-native-realpath-v3.patch (5.18 KB)
use-native-realpath-v4.patch (6.34 KB)


-- =

https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request / ruby-lang.org?subject=3Dunsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>