Issue #3924 has been updated by Xavier Shay.


Hello,
This is a long message, but I have tried to address most of the concerns with my patch. There are two sections: one for the technical detail of the patch, and one for benchmarks.

= The patch

I have split my patch up into a six smaller patches which I hope are easier to review.

Descriptions: https://gist.github.com/35060fbcefb25cf1a456
Patches:      https://gist.github.com/58dbd6e72c1a1f47a415
GitHub:       https://github.com/xaviershay/ruby/commits/require-patch

In addition I have addressed the following concerns from Yusuke:
- Please use 4 space for indent, with 8 space tab.  (Emacs-style)
- Please use C89.  Please don't use // comments.
- Please don't export function without "rb_" prefix, to avoid symbol
  conflicts.

I think there are still two big concerns:
1) Significantly different from original load.c
2) Compatibility with Windows
3) LoadedFeaturesProxy

Regarding #1, I don't have anything further to add at the moment.
Regarding #2, it will need to be tested thoroughly like all other platforms, but I don't believe it would result in significant architectural changes.

For #3, I was worried about it but on reflection I think it may not be a problem.

Yusuke wrote:
"It is an impossible approach because some extension library can
modify $LOADED_FEATURES directly by rb_ary_push."

The hooks are only used to keep a cache up to date. If we get a cache-miss, we could just revert to scanning $LOADED_FEATURES again which would pick up any items that were added with rb_ary_push.
This will degrade worst-case performance, but I don't think that matters because most requires will succeed.
Does this make sense?

This would also mean the addition of `lib/enumerator.rb` in my patch would be unnecessary (though IMO probably still a good idea).

= Benchmarks

Yusuke I think I mislead you with the benchmarks. I do not expect full_load_path_benchmark.rb to be faster. I was only using it to ensure I didn't make that case any worse. See below for further details.

Here are my updated timings, with descriptions of each. For the load path and require benchmark I have only included the final number in the test to allow for a more focused comparison.

Run on OSX 10.7

	            1.9.2p180	  1.9.3*	  1.9.3** 	1.9.3***
load path	    0.667376	  0.866228	0.623411  0.699315
requires   	  6.745875	  7.921598	7.261326	0.285119
new rails	    2.172	      1.412	    1.624	    1.082
medium rails  18.372	    17.59	    15.855	  10.488

* 1.9.3r31827 (31/5)
** 1.9.3r31827, with r30789 reverted
*** 1.9.3r31827, with my patch

== Benchmark descriptions

### load path - https://gist.github.com/985224

Loading a file 50 times with 2000 entries in the $LOAD_PATH.
This is an academic benchmark, since 2000 entries is a large amount. For comparison, our rails app load path only has 42 entries.

Performance for all versions is linear and roughly equivalent.
Note that I removed the `GC.disable` that was in earlier versions of this benchmark.

### requires - https://gist.github.com/c8d0d422a9203e1fe492

Loading 2500 files. This is a more relavent benchmark. Our Rails app loads ~2200 files, large apps can load as many as 9000 [1].
All versions display an exponential increase in time as N increases. 1.8.7, though not included in this measurement, also displays such an exponential curve but it isn't as noticeable in comparison with 1.9 since the magnifier is far less [2].


[1] sent to me via private correspondence
[2] see https://gist.github.com/c8d0d422a9203e1fe492

### new rails

Using rails 3.0.7:

    rails new test-rails-app
    cd test-rails-app
    time ruby script/rails runner "puts 1"

### medium rails

This is my main Rails code base. Sorry I cannot share it, though this benchmark is not required to make a case for this patch since it shows roughly the same percentage improvement as a blank rails app.


Thanks,
Xavier
----------------------------------------
Bug #3924: Performance bug (in require?)
http://redmine.ruby-lang.org/issues/3924

Author: Carsten Bormann
Status: Open
Priority: Normal
Assignee: 
Category: core
Target version: 1.9.3
ruby -v: -


=begin
 Running irb < /dev/null in 1.9.2 causes 3016 calls to lstat64.
 
 For instance, there is a sequence of 28 repetitions each of lstat calls to all 6 non-empty path prefixes of /opt/local/lib/ruby1.9/1.9.1/irb.rb -- a total of 170 lstats apparently just to load this file; another set of lstats then occurs later for another 18 (times 6) times.  Clearly, something is running amok in the calling sequence rb_require_safe -> realpath_rec -> lstat.
 
 Another example: Running a simple test with the baretest gem causes 17008 calls to lstat.  According to perftools.rb, 80 % of the 1.2 seconds of CPU is used in Kernel#gem_original_require (and another 12 in GC, some of which may be caused by this).
=end



-- 
http://redmine.ruby-lang.org