Issue #14759 has been updated by mperham (Mike Perham).


Yusuke, your script doesn't create any memory fragmentation: it throws away everything after 1600 and reads the exact same amount of data each time.  I don't believe this is how Rails apps behave; they fragment over time.  My script creates randomly sized data and holds onto 10% of it to create "holes" in the heap and fragment memory quickly.  I believe this better represents normal app conditions.  I've edited your script slightly to randomly keep some data; its results better match the ones I posted earlier.  I think changing the IO to read random sizes would also exhibit worse memory usage:

~~~
$ time MALLOC_ARENA_MAX=2 /ruby/2.5.1/bin/ruby frag2.rb 
VmRSS:	 1620356 kB

real	1m20.755s
user	0m38.057s
sys	1m2.881s

$ time /ruby/2.5.1/bin/ruby frag2.rb 
VmRSS:	 1857284 kB

real	1m19.642s
user	0m36.645s
sys	1m4.480s
~~~

~~~
$ more frag2.rb 
THREAD_COUNT = (ARGV[0] || "10").to_i

File.write("/tmp/tmp.txt", "x" * 1024 * 64)

srand(1234)
Threads = []
Save = []

THREAD_COUNT.times do
  Threads << Thread.new do
    a = []
    100_000.times do
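      # Read the whole 64 KB file; File.open avoids Kernel#open's special-casing of paths.
      a = File.open("/tmp/tmp.txt") {|f| f.read }
      # Keep roughly 1.6% of the reads alive so freed neighbors leave holes in the heap.
      Save << a if rand(100_000) < 1600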
    end
  end
end

Threads.each {|th| th.join }
GC.start

IO.foreach("/proc/#{$$}/status") do |line|
  print line if line =~ /VmRSS/
end if RUBY_PLATFORM =~ /linux/
~~~
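The random-size idea mentioned above could be sketched roughly as follows. This is not a script I have measured, only an illustration: the method names (`run_random_reads`, `vm_rss_line`), the keyword parameters, and the per-thread `Random` seeding are all assumptions introduced here, and the `Mutex` is added only to make the shared array access explicit.

```ruby
# Hypothetical variant of frag2.rb: each iteration reads a random number of
# bytes instead of the whole file. Names and defaults are illustrative only.
def run_random_reads(path:, threads: 10, iterations: 100_000,
                     keep_per_100k: 1600, max_bytes: 64 * 1024, seed: 1234)
  File.write(path, "x" * max_bytes)
  saved = []
  lock = Mutex.new

  workers = Array.new(threads) do |i|
    Thread.new do
      rng = Random.new(seed + i) # per-thread generator keeps sizes reproducible
      iterations.times do
        size = rng.rand(1..max_bytes) # random read size, 1..max_bytes bytes
        data = File.open(path) { |f| f.read(size) }
        # Retain a small random fraction so freed neighbors leave holes.
        lock.synchronize { saved << data } if rng.rand(100_000) < keep_per_100k
      end
    end
  end

  workers.each(&:join)
  saved
end

# Returns the VmRSS line from /proc on Linux, or nil elsewhere.
def vm_rss_line
  return nil unless RUBY_PLATFORM =~ /linux/
  File.foreach("/proc/#{Process.pid}/status").find { |line| line.start_with?("VmRSS") }
end
```

To mirror the script above, one would call `saved = run_random_reads(path: "/tmp/tmp.txt")`, then `GC.start`, then `print vm_rss_line`, with and without `MALLOC_ARENA_MAX=2` in the environment.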

----------------------------------------
Feature #14759: [PATCH] set M_ARENA_MAX for glibc malloc
https://bugs.ruby-lang.org/issues/14759#change-72188

* Author: normalperson (Eric Wong)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
----------------------------------------
Not everybody benefits from jemalloc and the extra download+install
time is not always worth it.  Let's make the user experience for
glibc malloc users better, too.

Personally, I prefer using M_ARENA_MAX=1 (via the MALLOC_ARENA_MAX
environment variable), but there is currently a performance penalty
for that.


gc.c (Init_GC): set M_ARENA_MAX=2 for glibc malloc

glibc malloc creates too many arenas and leads to fragmentation.
Given the existence of the GVL, clamping to two arenas seems
to be a reasonable trade-off for performance and memory usage.

Some users (including myself, for several years now) prefer only
one arena, so continue to respect users' wishes when
MALLOC_ARENA_MAX is set.

Thanks to Mike Perham for the reminder [ruby-core:86843]


This doesn't seem to conflict with jemalloc, so it should be safe
for all glibc-using systems.
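
The rule in the commit message (default to two arenas, but never override a user-supplied MALLOC_ARENA_MAX) can be illustrated at the Ruby level. The patch itself makes the change in C inside gc.c; the sketch below only shows the same precedence rule applied to a child process's environment. `DEFAULT_ARENA_MAX`, `arena_env`, and `run_with_arena_cap` are hypothetical names, not part of the patch.

```ruby
# Sketch only: apply "2 arenas unless the user chose otherwise" to a child
# process via its environment. This is not how the patch is implemented.
DEFAULT_ARENA_MAX = "2" # the value proposed in the patch

# Builds the environment override, keeping any value the user already set.
def arena_env(env = ENV)
  { "MALLOC_ARENA_MAX" => env.fetch("MALLOC_ARENA_MAX", DEFAULT_ARENA_MAX) }
end

# Runs a command with the arena cap applied; returns Kernel#system's result.
def run_with_arena_cap(*command)
  system(arena_env, *command)
end
```

For example, `run_with_arena_cap("/ruby/2.5.1/bin/ruby", "frag2.rb")` would run the benchmark with two arenas, unless MALLOC_ARENA_MAX is already exported in the calling shell.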


---Files--------------------------------
0001-gc.c-Init_GC-set-M_ARENA_MAX-2-for-glibc-malloc.patch (1.46 KB)

