I have been posting to the ruby-talk mailing list about ruby memory and GC, and I think the topic is
ready for the ruby-core mailing list. The thread topic was "Memory Question". www.ruby-lang.org appears
to be down at the moment, otherwise I'd post a link to the mailing list archives.
Here is my latest post, followed by some input from H. Yamamoto:
Well I did some digging, and I'm left with more questions. The following script does NOT collect the
800,000 strings created (this doesn't use Mysql, Rails or anything else, just ruby core). I ran this
at the same time as a couple of other scripts which use 150MB+ of memory, to see if the OS would
somehow force ruby to do better GC'ing. It didn't, and swap was used. On my system this script
consistently used 42MB of memory, and the string count never went below 800,000. Why would the
strings not get GC'd when they are out of scope?
----------test1.rb----------------
def count_objects_for clazz
  c = 0
  ObjectSpace.each_object{ |o| c+=1 if o.is_a? clazz }
  c
end
class A
  def run
    arr = []
    800000.times { arr << "d" * 7 }
    puts "check now"
    sleep 10
  end
end
puts count_objects_for( String )
A.new.run
puts count_objects_for( String )
GC.start
puts count_objects_for( String )
Thread.new do
  loop do
    puts count_objects_for( String )
    GC.start
    sleep 5
  end
end.join
-------end test1.rb-------------
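As an aside, the counting helper used in these scripts can probably be shortened, since ObjectSpace.each_object accepts a class or module to filter on and returns the number of objects it visited. A minimal sketch (count_instances is a hypothetical name, not something from the original thread):

```ruby
# count_instances is a hypothetical helper name used only for this sketch.
# ObjectSpace.each_object(klass) yields only instances of klass (including
# instances of its subclasses) and returns how many objects it yielded.
def count_instances(klass)
  ObjectSpace.each_object(klass) { |o| }
end

puts count_instances(String)
```

This should report the same kind of number as count_objects_for( String ), without walking every object in the process.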
Second test: same script as above, but with a slight modification to call Array#clear. Now this
actually collects the Strings. I am specifically removing the references in the code below by clearing
the Array arr. But in the code above I don't reference arr or its value anywhere outside of the method
it is in, so after that call returns shouldn't it be fair game for the garbage collector?
-------- test2.rb-----------
def count_objects_for clazz
  c = 0
  ObjectSpace.each_object{ |o| c+=1 if o.is_a? clazz }
  c
end
class A
  def run
    arr = []
    800000.times { arr << "d" * 7 }
    puts "check now"
    sleep 10
    arr.clear
  end
end
puts count_objects_for( String )
A.new.run
puts count_objects_for( String )
GC.start
puts count_objects_for( String )
Thread.new do
  loop do
    puts count_objects_for( String )
    GC.start
    sleep 5
  end
end.join
-------end test2.rb-------------
I also modified this second script to run with 8 million strings. When the Strings are GC'd the ruby
process's memory usage goes down, but not as much as I'd expect. With 8 million strings I am getting
330+MB of memory in use, but when I GC them, memory only seems to go down to 191MB, where I would think
it would fall back down into the single or double digits, perhaps 15MB or 20MB. Any reason why ruby
doesn't let go? Is this a problem? So far I have run this with ruby 1.8.3 (2005-06-23) [i486-linux] and
ruby 1.8.4 (2005-12-24) [powerpc-linux], with similar results.
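For anyone who wants to repeat the measurement from inside the script rather than watching the process from outside, here is a rough sketch. It assumes a Linux system where /proc/self/status has a VmRSS line reported in kB; rss_kb is a hypothetical helper name, and it returns nil where that file or line is missing:

```ruby
# rss_kb is a hypothetical helper for this sketch, not part of Ruby.
# Assumption: Linux-style /proc/self/status containing a line such as
#   VmRSS:     12345 kB
# Returns the resident set size in kB, or nil if it cannot be determined.
def rss_kb
  status = "/proc/self/status"
  return nil unless File.readable?(status)
  line = File.readlines(status).find { |l| l =~ /\AVmRSS:/ }
  line && line[/\d+/].to_i
end

puts "RSS before GC: #{rss_kb.inspect} kB"
GC.start
puts "RSS after GC:  #{rss_kb.inspect} kB"
```

Calling rss_kb right before and right after GC.start in the scripts above would show whether the freed strings are actually handed back to the OS.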
Here is H. Yamamoto's response:
Hello.
Ruby's GC is conservative, so there is no guarantee object is freed even if
object is not reachable from GC's root.
But anyway, probably I found the problem on ELTS_SHARED.
/////////////////////////////////
def pause
  GC.start
  $stdout.puts "measure memory and hit any key..."
  $stdout.flush
  $stdin.getc
end
pause
a = Array.new(10000){ "." * 1000 } # huge memory
pause
a.map!{|s| s[-100..-1]} # memory stays large
pause
a.map!{|s| s[-3..-1]} # reduces memory
pause
/////////////////////////////////
This is because rb_str_substr (string.c) 's
else if (len > sizeof(struct RString)/2 &&
         beg + len == RSTRING(str)->len && !FL_TEST(str, STR_ASSOC)) {
    str2 = rb_str_new3(rb_str_new4(str));
    RSTRING(str2)->ptr += RSTRING(str2)->len - len;
    RSTRING(str2)->len = len;
}
is executed at
a.map!{|s| s[-100..-1]} # memory stays large
rb_str_new3 generates ELTS_SHARED RString which holds original RString.
When original string becomes unreachable, it should be garbage collected.
But ELTS_SHARED substring references it (RString#aux->shared), so not collected
until substring itself becomes unreachable.
I haven't confirmed this is really cause of your problem, but there is possibility
this hidden huge string eats memory. (maybe same thing happens on Array)
---- end response ----
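If the shared-substring explanation is right, one way to check it might be to copy the tail into a fresh string instead of keeping the slice. This is only a sketch under an assumption that has not been measured here: that String#+ always allocates a new buffer, so the result does not hold on to the original large string. copy_tail is a hypothetical helper name:

```ruby
# copy_tail is a hypothetical helper for this sketch, not part of Ruby.
# Assumption (unverified on 1.8.x): String#+ builds a brand-new string with
# its own buffer, so the result should not share storage with str.
# Expects len <= str.length; otherwise the slice is nil and + raises.
def copy_tail(str, len)
  "" + str[-len..-1]
end

a = Array.new(10) { "." * 1000 }   # smaller than the 10000 used above
a.map! { |s| copy_tail(s, 100) }   # keep only the last 100 characters
puts a.first.length
```

Swapping this in for the s[-100..-1] step of the script above and measuring memory at each pause would show whether the hidden original strings are what keeps memory large.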
Could anyone enlighten me here, or confirm that there is a problem, as H. Yamamoto suspects?
Thanks,
Zach