Issue #17837 has been updated by Eregon (Benoit Daloze).


Shouldn't an app have a global timeout per request anyway, and that would catch regexps and other things too?
Capturing the time in regexp interrupt checks is easy but sounds fairly expensive.

Could such regexps emit a warning since they can result in pathological backtracking?
Or is it too difficult to detect them / the problematic patterns evolve over time?

FWIW it's possible to use a non-backtracking regexp engine for many but not all Ruby regular expressions (`--use-truffle-regex` in TruffleRuby).
So that would be one way: warn for those regexps that cannot be implemented without backtracking, but that's probably more restrictive than needed.

----------------------------------------
Feature #17837: Add support for Regexp timeouts
https://bugs.ruby-lang.org/issues/17837#change-91771

* Author: sam.saffron (Sam Saffron)
* Status: Open
* Priority: Normal
----------------------------------------
### Background

ReDoS are a very common security issue. At Discourse we have seen a few through the years. https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS

In a nutshell there are 100s of ways this can happen in production apps, the key is for an attacker (or possibly innocent person) to supply either a problematic Regexp or a bad string to test it with.

```
/A(B|C+)+D/ =~ "A" + "C" * 100 + "X"
```

Having a problem Regexp somewhere in a large app is a universal constant, it will happen as long as you are using Regexps. 


Currently the only feasible way of supplying a consistent safeguard is by using `Thread.raise` and managing all execution. This kind of pattern requires usage of a third party implementation. There are possibly issues with jRuby and Truffle when taking approaches like this.

### Prior art

.NET provides a `MatchTimeout` property per: https://docs.microsoft.com/en-us/dotnet/api/system.text.regularexpressions.regex.matchtimeout?view=net-5.0

Java has nothing built in as far as I can tell: https://stackoverflow.com/questions/910740/cancelling-a-long-running-regex-match

Node has nothing built in as far as I can tell: https://stackoverflow.com/questions/38859506/cancel-regex-match-if-timeout


Golang and Rust uses RE2 which is not vulnerable to DoS by limiting features (available in Ruby RE2 gem)

```
irb(main):003:0> r = RE2::Regexp.new('A(B|C+)+D')
=> #<RE2::Regexp /A(B|C+)+D/>
irb(main):004:0> r.match("A" + "C" * 100 + "X")
=> nil
```

### Proposal

Implement `Regexp.timeout` which allow us to specify a global timeout for all Regexp operations in Ruby. 

Per Regexp would require massive application changes, almost all web apps would do just fine with a 1 second Regexp timeout.

If `timeout` is set to `nil` everything would work as it does today, when set to second a "monitor" thread would track running regexps and time them out according to the global value.

### Alternatives 

I recommend against a "per Regexp" API as this decision is at the application level. You want to apply it to all regular expressions in all the gems you are consuming.

I recommend against a move to RE2 at the moment as way too much would break 


### See also: 

https://people.cs.vt.edu/davisjam/downloads/publications/Davis-Dissertation-2020.pdf
https://levelup.gitconnected.com/the-regular-expression-denial-of-service-redos-cheat-sheet-a78d0ed7d865





-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request / ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>