Issue #10869 has been updated by Rodrigo Rosenfeld Rosas.


Good to know that you think this could speed up requires :) At least this seems like something feasible in case someone would be interested in giving it a try.

For the ABI case, maybe the easiest would be to separate the database location by ABI version, for example:

~/.mri/abi-xx/cache-database
~/.mri/abi-xy/cache-database

Or it could use the exact Ruby version/patch level as the subdirectory name. I believe a first step towards precompiling would be great, as it would allow us to evaluate how much faster require could get with such an approach. If the results are great, then it would be worth exploring more efficient solutions for the caching format, the invalidation rules and how to support most cases. But even an initial version that only took care of the common cases would already let us measure how much faster we could require files or gems this way...

Initially the database could even be some sort of Marshal dump if that makes it easier to implement...
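
To make the layout above a bit more concrete, here is a minimal sketch and not a proposal for the real implementation. The helper names (`cache_database_path`, `save_database`, `load_database`) and the Hash-based format are assumptions made up for illustration; only `RbConfig::CONFIG["ruby_version"]` and `Marshal` are existing Ruby APIs.

```ruby
require "rbconfig"
require "tmpdir"
require "fileutils"

# Hypothetical helper: one cache directory per ABI version, so a cache
# written by one Ruby is never read by an incompatible one.
def cache_database_path(root = File.join(Dir.home, ".mri"))
  abi = RbConfig::CONFIG["ruby_version"] # ABI-compatible version string
  File.join(root, "abi-#{abi}", "cache-database")
end

# Hypothetical on-disk format: a plain Hash serialized with Marshal.
def save_database(path, db)
  FileUtils.mkdir_p(File.dirname(path))
  File.binwrite(path, Marshal.dump(db))
end

# Returns an empty Hash when no database has been written yet.
def load_database(path)
  File.exist?(path) ? Marshal.load(File.binread(path)) : {}
end

# Usage: write and read back a database under a temporary root.
Dir.mktmpdir do |root|
  path = cache_database_path(root)
  save_database(path, { "test0.rb" => "placeholder for compiled data" })
  p load_database(path).keys
end
```

Using the exact Ruby version/patch level instead would only change the directory name computed in `cache_database_path`.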

----------------------------------------
Feature #10869: Add support for option to pre-compile Ruby files
https://bugs.ruby-lang.org/issues/10869#change-51568

* Author: Rodrigo Rosenfeld Rosas
* Status: Open
* Priority: Low
* Assignee: 
----------------------------------------
I know this topic is tricky but please bear with me.

Goal: improve file loading performance to speed up the boot process of applications that require lots of files/gems.

Background:

Currently most frameworks/gems rely on the autoload feature to allow applications to load faster by lazy loading files as missing constants are referenced.

Autoload behavior may lead to hard-to-understand bugs and I believe this is the main reason why Matz discourages its usage:

https://bugs.ruby-lang.org/issues/5653

I described a bug involving autoload in a real scenario in this comment on that same issue:

https://bugs.ruby-lang.org/issues/5653#note-26

While I agree that autoload should be discouraged I think we should provide an alternative for speeding up application loading.

Overall benchmarks:

I decided to create a simple benchmark in order to measure how much time MRI would take to load 10_000 files containing a hundred methods each:

~~~
10000.times{|j| File.open("test#{j}.rb", 'w'){|f|f.puts "class A#{j}"; 100.times{|i| f.puts "  def m#{i}; end"}; f.puts "end"}}

time ruby -r benchmark -I. -e 'puts Benchmark.realtime{10000.times{|i|require "test#{i}"}}'

8.766814350005006

real    0m10.068s
user    0m9.416s
sys     0m0.532s

time cat test*.rb > /dev/null

real    0m0.107s
user    0m0.068s
sys     0m0.040s
~~~

As you can see, most of the time is spent on MRI itself rather than on disk. Using require_relative doesn't make any real difference either.

Suggested solution: Pre-compiled files

I know nothing about MRI internals, but I suspect MRI could support some sort of database containing a precompiled version of each file (the bytecode, maybe). The database would store the size and a hash for each processed file. If the size and hash remain the same, it would assume the bytecode in the database is up to date, which should happen in most cases. In that case those files could possibly be loaded much faster.
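
The size-and-hash check could look roughly like the sketch below. It is only an illustration under stated assumptions: the `Entry` struct, `fingerprint` and `fetch_compiled` are hypothetical names, the database is a plain Hash, and the "compiled" payload is whatever the caller's block returns rather than real MRI bytecode.

```ruby
require "digest"
require "tmpdir"

# Hypothetical record kept per source file.
Entry = Struct.new(:size, :digest, :payload)

# Size plus a content hash of the file.
def fingerprint(path)
  [File.size(path), Digest::SHA256.file(path).hexdigest]
end

# Returns the cached payload when size and hash both match; otherwise
# calls the block to "compile" the file and stores the fresh result.
def fetch_compiled(db, path)
  size, digest = fingerprint(path)
  entry = db[path]
  return entry.payload if entry && entry.size == size && entry.digest == digest

  payload = yield(path)
  db[path] = Entry.new(size, digest, payload)
  payload
end

# Usage: the block stands in for the real compilation step.
Dir.mktmpdir do |dir|
  file = File.join(dir, "test0.rb")
  File.write(file, "class A0; end\n")
  db = {}
  fetch_compiled(db, file) { |f| "compiled #{File.basename(f)}" } # miss
  fetch_compiled(db, file) { |f| raise "not reached on a hit" }   # hit
end
```

Note that this sketch hashes the file on every lookup, so the source is still read from disk; the benchmark above suggests that part is cheap compared to the time spent inside MRI.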

In order to avoid additional overhead or bugs in some cases, it might be better to put the pre-compile behavior behind an option, which would let us test this approach.

I understand that it may be complicated to precompile all kinds of Ruby files, as they could execute code as well rather than simply declaring classes. Even so, I think it would be worth detecting those files and skipping them, so that only files containing simple class declarations get precompiled, which is already the case for a lot of files. Maybe this could encourage gem owners to move their executable statements to a separate file in order to allow the classes to be precompiled in the future...

Do you have any other suggestions to speed up application loading that do not involve autoload and conditional requires? Do you think precompilation is possible and worthwhile on MRI?



-- 
https://bugs.ruby-lang.org/