Issue #15916 has been updated by luke-gru (Luke Gruber).


I managed to track down the leak, and it's related to rb_fstring(). 

reg_set_source() calls rb_fstring() with the tainted string, and memory leaks whenever rb_fstring() is given
a non-embedded string that is not "bare" (that is, one that has ivars or is tainted).

The leak occurs in str_replace_shared_without_enc(), whose first argument should always be a new string,
not one that already has a buffer, because this function doesn't free any previously allocated buffer.


Adding
``` c
if (!STR_EMBED_P(str2) && !FL_TEST(str2, STR_SHARED|STR_NOFREE)) {
    ruby_sized_xfree(STR_HEAP_PTR(str2), STR_HEAP_SIZE(str2));
}
```

to the else case plugs the leak. I don't think this function should be called at all in rb_fstring(),
but that could be a different issue.

----------------------------------------
Bug #15916: Memory leak in Regexp literal interpolation
https://bugs.ruby-lang.org/issues/15916#change-78607

* Author: mltsy (Joe Marty)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: ruby 2.6.3p62 (2019-04-16 revision 67580) [x86_64-linux]
* Backport: 2.4: UNKNOWN, 2.5: UNKNOWN, 2.6: UNKNOWN
----------------------------------------
When interpolating a string inside a Regexp literal, if the string contains a multibyte character loaded from a file (not sure if this covers all the cases, but this is what triggers it for me), Ruby leaks memory.

The code below reproduces the problem, while outputting the process memory usage as it rises (get_process_mem gem is required).

Ways to avoid the memory leak (although I don't know why) include:
1. Using the string literal to define `pattern` directly (not loading it from a file)
2. Using `Regexp.new` instead of a literal interpolation (`/#{...}/`)
3. Shortening the string to just a few characters (maybe small enough to fit inside a single RVALUE?)
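
As an illustration of workaround 2, here is a minimal sketch that builds the same case-insensitive pattern with `Regexp.new` instead of the `/#{...}/i` literal. The helper name `build_pattern` is hypothetical and not part of the report. The sketch only shows the substitution; it does not itself demonstrate that memory stays constant.

``` ruby
# Hypothetical helper (not from the report): builds the regexp at runtime
# with Regexp.new rather than through literal interpolation.
def build_pattern(source)
  # Regexp::IGNORECASE corresponds to the /i flag on the literal form.
  Regexp.new(source, Regexp::IGNORECASE)
end

# Same string as in the reproduction script: too long to be embedded,
# and ending in a multibyte character (U+00A0).
pattern = "String that doesn't fit into a single RVALUE, with a multibyte char:" + 160.chr(Encoding::UTF_8)

re = build_pattern(pattern)
puts re.match?(pattern.upcase)
```

In the reproduction script this would mean replacing `/#{pattern}/i` inside the `100_000.times` block with `build_pattern(pattern)`.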

``` ruby
require 'get_process_mem'

str = "String that doesn't fit into a single RVALUE, with a multibyte char:" + 160.chr(Encoding::UTF_8)
File.write('weirdstring.txt', str)
pattern = File.read("weirdstring.txt")

loop do
  print "Running... "

  100_000.times { /#{pattern}/i }

  puts " process mem: #{GetProcessMem.new.mb.to_i}MB"
end
```

Expected Result:
Constant memory usage (avoiding the leak produces constant memory usage between 10-20MB)

Actual Result:
Continual memory growth (it only takes 60 seconds or so to consume 500MB)


