Hello.

>I have been posting to the ruby-talk mailing list about ruby memory and GC, and I think it's ready
>for the ruby-core mailing list. The thread topic was "Memory Question". www.ruby-lang.org appears to be
>down at the moment, otherwise I'd post links to the mailing list archives.
>
>Here is the latest post from me; afterwards is some input from H. Yamamoto:
>
>Well I did some digging, and I'm left with more questions. The following script does NOT collect the
>800,000 strings created (this doesn't use Mysql, Rails or anything else, just ruby core). I ran this
>alongside a couple of other scripts which utilize 150mb+ of memory, to see if the os would
>somehow force ruby to do better GC'ing; it didn't, and swap was used. On my system this script
>consistently used 42Mb of memory, and the string count never went below 800,000. Why would the
>strings not get gc'd when they are out of scope?

Because Ruby's GC is conservative. Python uses reference-count-based GC, so it is guaranteed that an unreferenced object will be freed at the end of its scope unless there are reference loops, but conservative GC is... a mostly-collecting GC. Of course, conservative GC has an advantage over reference-count GC: there is no need to IncRef or DecRef by hand, so C extensions are easy to write. It is probably a trade-off.

I tried test1.rb, and I got the following result.

  E:\>e:\ruby\bin\ruby a.rb
  108
  check now
  853447
  109
  110
  110
  110
  110
  110
  a.rb:23:in `join': Interrupt
          from a.rb:23

But when I tried the script attached to this mail, the array was never freed. Probably in your environment the array occasionally wasn't freed in test1.rb either.

But I think this is not the cause of your problem, because ... ([ruby-talk:181563])

  Received query results 89Mb
  String count 901034
  Threshold breaker String (827236) started w/ 73743 ended w/ 900979
  Threshold breaker Table1 47000 started w/ 0 ended w/ 47000
  Starting GC 89Mb
  String count 25041
  Done with GC 82Mb
  Threshold breaker String -48704 started w/ 73743 ended w/ 25039

... the String count went down.
If this memory problem were caused by one big array (`records` in test_build_mem_usage?) that was not freed, the String count would not go down.

Can you try a patch like this?

Index: array.c
===================================================================
RCS file: /src/ruby/array.c,v
retrieving revision 1.137.2.30
diff -u -w -b -p -r1.137.2.30 array.c
--- array.c	22 Dec 2005 07:08:51 -0000	1.137.2.30
+++ array.c	28 Feb 2006 04:12:46 -0000
@@ -2953,6 +2953,16 @@ rb_ary_flatten(ary)
     return ary;
 }
 
+static VALUE
+ary_capacity(VALUE ary)
+{
+    if (FL_TEST(ary, ELTS_SHARED)) {
+        return LONG2FIX(0);
+    }
+    else {
+        return LONG2FIX(RARRAY(ary)->aux.capa);
+    }
+}
 
 /* Arrays are ordered, integer-indexed collections of any object.
  * Array indexing starts at 0, as in C or Java.  A negative index is
@@ -3050,6 +3060,8 @@ Init_Array()
     rb_define_method(rb_cArray, "flatten!", rb_ary_flatten_bang, 0);
     rb_define_method(rb_cArray, "nitems", rb_ary_nitems, 0);
 
+    rb_define_method(rb_cArray, "capacity", ary_capacity, 0);
+
     id_cmp = rb_intern("<=>");
     inspect_key = rb_intern("__inspect_key__");
 }
Index: string.c
===================================================================
RCS file: /src/ruby/string.c,v
retrieving revision 1.182.2.44
diff -u -w -b -p -r1.182.2.44 string.c
--- string.c	27 Oct 2005 08:19:20 -0000	1.182.2.44
+++ string.c	28 Feb 2006 04:13:06 -0000
@@ -4613,6 +4613,16 @@ rb_str_setter(val, id, var)
     *var = val;
 }
 
+static VALUE
+str_capacity(VALUE str)
+{
+    if (FL_TEST(str, ELTS_SHARED)) {
+        return LONG2FIX(0);
+    }
+    else {
+        return LONG2FIX(RSTRING(str)->aux.capa);
+    }
+}
 
 /*
  * A <code>String</code> object holds and manipulates an arbitrary sequence of
@@ -4730,6 +4740,8 @@ Init_String()
     rb_define_method(rb_cString, "sum", rb_str_sum, -1);
 
+    rb_define_method(rb_cString, "capacity", str_capacity, 0); /* for debug */
+
     rb_define_global_function("sub", rb_f_sub, -1);
     rb_define_global_function("gsub", rb_f_gsub, -1);
...
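For anyone who cannot rebuild the interpreter with the patch above, a rough stand-in may be possible on newer Ruby versions, which ship an `objspace` extension providing `ObjectSpace.memsize_of`. The sketch below assumes such a Ruby (not the 1.8 series discussed in this thread); the helper name `total_memsize` is hypothetical, and the byte figures it reports are approximations rather than the exact `capa` values the patch exposes.

```ruby
require 'objspace'

# Hypothetical helper (not part of the patch): walk all live objects of
# the given class and sum the approximate bytes each one occupies.
def total_memsize(type)
  count = 0
  bytes = 0
  ObjectSpace.each_object(type) do |o|
    count += 1
    bytes += ObjectSpace.memsize_of(o)
  end
  [count, bytes]
end

GC.start
[String, Array].each do |type|
  count, bytes = total_memsize(type)
  puts "#{type}: count = #{count}, approx bytes = #{bytes}"
end
```

The numbers are only useful relative to each other, for example before and after the 800,000 strings are created.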
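Then please embed code like this into the script from [ruby-talk:181563] to get the real memory usage of String and Array objects. That may make the situation clearer.

  # Insert thousands separators: 1234567 -> "1,234,567"
  def format(num)
    num.to_s.gsub(/(\d{1,3})(?=(?:\d{3})+$)/) { $1 + "," }
  end

  # Print how many live objects of `type` exist and their total capacity.
  # Needs the `capacity` method added by the patch above.
  def count(type)
    count = 0
    capacity = 0
    ObjectSpace.each_object(type) do |o|
      count += 1
      capacity += o.capacity
    end
    puts "#{type}"
    puts "  count = #{format(count)}"
    puts "  capacity = #{format(capacity)}"
  end

  # Force a GC pass, report, then wait so memory can be checked from outside.
  def pause
    GC.start
    count(String)
    count(Array)
    puts
    sleep 5
  end

  class A
    def run
      arr = []
      800000.times { arr << "d" * 7 }
      pause
    end
  end

  pause          # baseline
  A.new.run      # while `arr` is still in scope
  loop { pause } # after `arr` has gone out of scope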
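The inference made earlier in this message (a falling String count means the big array was released) can be checked on an unpatched interpreter with a smaller sketch. It assumes only `GC.start` and `ObjectSpace.each_object`, which returns the number of objects visited; the method names `live_strings` and `measure` are hypothetical. While the array is referenced, its strings must stay alive; after the reference is dropped the count may or may not fall, since the collector is conservative.

```ruby
# Count live String objects after forcing a GC pass.
def live_strings
  GC.start
  ObjectSpace.each_object(String) {}
end

# Returns [baseline, held, released] String counts for n strings.
def measure(n)
  baseline = live_strings

  records = Array.new(n) { "d" * 7 } # strings reachable through `records`
  held = live_strings                # cannot be freed while the array lives

  records = nil                      # drop the only reference
  released = live_strings            # may or may not drop back toward baseline

  [baseline, held, released]
end

baseline, held, released = measure(100_000)
puts "baseline=#{baseline} held=#{held} released=#{released}"
```

If `released` stays close to `held`, something (possibly a stale value on the machine stack) is still keeping the array reachable.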