Issue #11148 has been updated by Benoit Daloze.


Aaron Patterson wrote:
> @Benoit yes, for performance, and to avoid catching load errors.  If my plan is successful, rubygems would stop adding directories to the load path.  That means searching *should* be relatively fast (since the load path would be relatively small).  With the current algorithm, the first require that "activates" a gem will always raise an exception, then the gem gets loaded, and all of the requires inside the gem will not raise an exception.  So say 98% of the time, require doesn't raise an exception.  If I stop adding directories to the load path, then 98% of requires *will* raise an exception.  I think that would incur a non-trivial overhead (though I don't have numbers for you right now).

Right, makes sense. It would be great to have some data though :)

How would the cache deal with duplicate keys, that is, when multiple gems have the same relative path inside their lib/ (etc.) directories? I think some gems might expect their own lib/ (etc.) directory to be in $LOAD_PATH.

Is the first find_gem_that_contains_file(file) O(number of installed gems) or is there some heuristic matching the first component of file with a gem name?
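
To make the duplicate-key question concrete, here is a minimal sketch of an index that keeps every owner of a relative path instead of just one. The gem names, the feature lists, and the helpers `index_features` and `ambiguous_features` are all made up for illustration and are not part of RubyGems.

```ruby
# Hypothetical index: require name => every gem that ships that file.
# Gem names and feature lists are invented for this example.
def index_features(gems)
  index = Hash.new { |hash, key| hash[key] = [] }
  gems.each do |gem_name, features|
    features.each { |feature| index[feature] << gem_name }
  end
  index
end

# Features claimed by more than one gem; a real cache would need
# a policy (activated gem first, highest version, ...) for these.
def ambiguous_features(index)
  index.select { |_feature, owners| owners.size > 1 }
end

gems = {
  "alpha-1.0" => ["alpha", "util/version"],
  "beta-2.0"  => ["beta", "util/version"],
}
index = index_features(gems)
conflicts = ambiguous_features(index) # "util/version" is claimed by both gems
```

With a plain one-value-per-key Hash, one of the two gems would silently win, which is the case I am asking about.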

----------------------------------------
Feature #11148: Add a way to require files, but not raise an exception when the file isn't found
https://bugs.ruby-lang.org/issues/11148#change-52429

* Author: Aaron Patterson
* Status: Open
* Priority: Normal
* Assignee: 
----------------------------------------
Hi,

I'm trying to make it so that RubyGems doesn't need to put directories on $LOAD_PATH (which is why I submitted Feature #11140).  I would like the `require` implemented in RubyGems to look up the file from a cache generated when the gem is installed, then pass a full file path to `require`.
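
As a rough illustration of the kind of cache meant here, the sketch below scans each gem's lib directory once and maps require names to absolute paths. It assumes only `.rb` files, and `build_require_cache` is a hypothetical helper, not an existing RubyGems method.

```ruby
# Hypothetical helper: build a Hash from require names ("foo/bar")
# to absolute file paths by scanning each gem's lib directory once.
# Only .rb files are considered in this sketch.
def build_require_cache(gem_lib_dirs)
  cache = {}
  gem_lib_dirs.each do |lib_dir|
    prefix = File.join(lib_dir, "") # lib_dir with a trailing separator
    Dir.glob(File.join(lib_dir, "**", "*.rb")).sort.each do |path|
      feature = path.delete_prefix(prefix).delete_suffix(".rb")
      cache[feature] ||= path # first gem wins; a real cache needs a policy here
    end
  end
  cache
end
```

A lookup is then a single Hash access, e.g. `cache["foo/bar"]`, and the resulting absolute path can be handed straight to `require`.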

The problem is that the user may have manipulated the load path somehow, and RubyGems needs to detect if the file is in the load path.  Today, the algorithm inside RubyGems looks something like this:

~~~ruby
def require file
  if file_is_from_a_default_gem?(file) # this is so you can install new versions of default gems
    add_default_gem_to_loadpath
  end
  real_require file
rescue LoadError
  gem = find_gem_that_contains_file(file)
  add_gem_to_loadpath gem
  real_require file
end
~~~

Instead of adding the directory to the load path, I would like to look up the full file path from a cache that is generated when the gem is installed.  If we had a cache, that means the new implementation would look like this:

~~~ruby
def require file
  if file_is_from_a_default_gem?(file) # this is so you can install new versions of default gems
    add_default_gem_to_loadpath
  end
  real_require file # gets slower as paths are added to LOAD_PATH
rescue LoadError
  gem = find_gem_that_contains_file(file) # use a cache so lookup is O(1)
  fully_qualified_path = gem.full_path file
  real_require fully_qualified_path # send a fully qualified path, so LOAD_PATH isn't searched
end
~~~

Unfortunately, that means that nearly every call to `require` that resolves to a gem file would raise (and rescue) an exception.  I'd like to add a version of `require` that we can call that *doesn't* raise an exception.  Then I could write the code like this:

~~~ruby
def require file
  if file_is_from_a_default_gem?(file) # this is so you can install new versions of default gems
    add_default_gem_to_loadpath
  end
  found = try_require file
  if nil == found
    gem = find_gem_that_contains_file(file) # use a cache so lookup is O(1)
    fully_qualified_path = gem.full_path file
    found = real_require fully_qualified_path # send a fully qualified path, so LOAD_PATH isn't searched
  end
  found
end
~~~

This would keep the load path small, and prevent exceptions from happening during the "normal" case.

I've attached a patch that implements `try_require`, but I'm not set on the name.  Maybe doing `require(file, exception: false)` would work too.
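
To make the intended semantics concrete, here is a rough pure-Ruby approximation. It is not the attached patch: it assumes only `.rb` files matter (no native extensions), handles only relative names, and the `load_path:` keyword is a hypothetical addition so it can be exercised in isolation.

```ruby
# Hypothetical pure-Ruby stand-in for the proposed try_require.
# Returns nil when the file is not found on the load path, otherwise
# whatever Kernel#require returns (true on first load, false afterwards).
# No exception is raised for a missing file.
def try_require(file, load_path: $LOAD_PATH)
  name = file.end_with?(".rb") ? file : "#{file}.rb"
  load_path.each do |dir|
    candidate = File.expand_path(name, dir.to_s)
    return require(candidate) if File.file?(candidate)
  end
  nil
end
```

The `nil` return is what lets the caller tell "not on the load path" apart from "already loaded" (`false`) without a `rescue`.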

---Files--------------------------------
try_require.patch (2.44 KB)


-- 
https://bugs.ruby-lang.org/