Issue #17729 has been reported by nobu (Nobuyoshi Nakada).

----------------------------------------
Bug #17729: Fix infinite loop when parsing RUBYLIB with locale-invalid bytes
https://bugs.ruby-lang.org/issues/17729

* Author: nobu (Nobuyoshi Nakada)
* Status: Open
* Priority: Normal
* Backport: 2.5: REQUIRED, 2.6: REQUIRED, 2.7: REQUIRED, 3.0: REQUIRED
----------------------------------------
https://github.com/ruby/ruby/pull/4281
> `ruby.c` sets up the interpreter's `$LOAD_PATH` by parsing a path
> separator-delimited list of paths from the `RUBYLIB` environment
> variable. The parser delegates to the C standard library function
> `mblen` to advance a pointer into the result of `getenv("RUBYLIB")` to
> break up the list by path separators.
> 
> `mblen` is a locale-aware API which is documented to return -1 when it
> encounters an invalid byte sequence for the current locale. When
> invoking the `ruby` CLI with a `RUBYLIB` environment variable containing
> an invalid byte sequence or when Ruby is installed to a path containing
> invalid byte sequences, the interpreter will enter an infinite loop
> during its boot sequence.
> 
> For example, passing in an `\xFF` byte when the locale is set to
> `en_US.UTF-8` will result in `mblen` returning -1, which causes the loop
> in `push_include` to spin infinitely.
> 
> I have also seen this bug manifest as an attempt to allocate a `String`
> with a negative length, which seems to imply that if the result of
> `getenv` is preceded in memory by a NUL byte or by UTF-8-invalid bytes
> greater than `\x7F`, the -1 return value of `mblen` results in a buffer
> under-read.
> 
> I do not believe this buffer under-read is exploitable: depending on
> the byte sequence, either the interpreter loops forever or the loop
> terminates with a negative pointer offset, which, when used to compute
> the capacity of an `RString`, results in an `ArgumentError` for a
> negative capacity.
> 
> The fix is to not treat the result of `getenv` as a locale-encoded
> string. The return values of `getenv` are platform strings whose only
> guarantee is that they are NUL-terminated.
> 
> This fix is applied in `push_include` and in the Cygwin-specific
> `push_include_cygwin`.
> 
> After this patch is applied, `RUBYLIB` with invalid UTF-8 bytes is
> parsed properly with a UTF-8 locale:
> 
> ```console
> $ env RUBYLIB="$(echo -ne "\xFF")" LOCALE="en_US.UTF-8" LC_ALL="en_US.UTF-8" ./ruby -e 'puts $LOAD_PATH.map(&:inspect)'
> `RubyGems' were not loaded.
> `did_you_mean' was not loaded.
> "\xFF"
> "/usr/local/lib/ruby/site_ruby/3.1.0"
> "/usr/local/lib/ruby/site_ruby/3.1.0/x86_64-darwin19"
> "/usr/local/lib/ruby/site_ruby"
> "/usr/local/lib/ruby/vendor_ruby/3.1.0"
> "/usr/local/lib/ruby/vendor_ruby/3.1.0/x86_64-darwin19"
> "/usr/local/lib/ruby/vendor_ruby"
> "/usr/local/lib/ruby/3.1.0"
> "/usr/local/lib/ruby/3.1.0/x86_64-darwin19"
> ```
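
As an illustration of the approach described in the quoted PR text, here is a minimal C sketch that splits a `RUBYLIB`-style list by scanning raw bytes instead of calling `mblen`. It is not the code from the pull request: `next_path_entry` and `PATH_SEP` are hypothetical names, and the separator is assumed to be `:` as on POSIX systems. Because the separator is compared byte by byte, an entry such as `\xFF` is passed through unchanged, and every call that returns an entry advances the cursor by at least one byte, so the scan cannot stall.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: POSIX-style list separator; other platforms may use ';'. */
#define PATH_SEP ':'

/*
 * Hypothetical helper: return a pointer to the next non-empty entry in a
 * separator-delimited list and store its byte length in *len.
 * *cursor is advanced past the entry. Returns NULL once the list is
 * exhausted. The input is treated as an opaque NUL-terminated byte
 * string; no locale-dependent decoding takes place.
 */
const char *
next_path_entry(const char **cursor, size_t *len)
{
    const char *p, *end;

    assert(cursor != NULL && *cursor != NULL && len != NULL);
    p = *cursor;

    /* Skip leading and repeated separators (empty entries). */
    while (*p == PATH_SEP)
        p++;

    if (*p == '\0') {
        *cursor = p;
        return NULL;
    }

    /* Find the end of this entry: the next separator or the final NUL. */
    end = strchr(p, PATH_SEP);
    if (end == NULL)
        end = p + strlen(p);

    *len = (size_t)(end - p);   /* never negative, since end > p here */
    *cursor = end;
    return p;
}
```

A caller would initialise a cursor from the result of `getenv("RUBYLIB")` (after checking it for `NULL`) and call `next_path_entry` in a loop until it returns `NULL`, copying `len` bytes for each entry.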


