Hello.

>I have been posting to the ruby-talk mailing list about ruby memory and GC, and I think it's ready
>for the ruby-core mailing list. The thread topic was "Memory Question"; www.ruby-lang.org appears to be
>down at the moment, otherwise I'd post links to the mailing list archives.
>
>Here is the latest post from me, afterwards is some input from H. Yamamoto:
>
>Well, I did some digging, and I'm left with more questions. The following script does NOT collect the
>800,000 strings created (this doesn't use MySQL, Rails or anything else, just ruby core). I ran this
>alongside a couple of other scripts which use 150MB+ of memory, to see if the OS would
>somehow force ruby to do better GC'ing; it didn't, and swap was used. On my system this script
>consistently used 42MB of memory, and the string count never went below 800,000. Why would the
>strings not get GC'd when they are out of scope?

Because Ruby's GC is conservative. Python uses reference-counting GC, so an unreferenced
object is guaranteed to be freed at the end of its scope unless it is part of a reference cycle.
A conservative GC, on the other hand, collects most garbage but not necessarily all of it.

Of course, conservative GC has an advantage over reference-counting GC: there is no need to call
IncRef or DecRef by hand, so C extensions are easy to write. It is probably a trade-off.
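
A minimal sketch of how to observe this from plain Ruby. The helper names `live_count` and `make_garbage` are made up for illustration. The script allocates strings that become unreachable, forces a collection, and prints how many String objects are left. How many survive depends on the platform and the build, so no particular number should be expected.

```ruby
# Count live objects of a class. With a block, each_object returns
# the number of objects it visited.
def live_count(klass)
  ObjectSpace.each_object(klass) { }
end

# Allocate n strings that nothing references once this method returns.
def make_garbage(n)
  n.times.map { "d" * 7 }
  nil
end

GC.start
before = live_count(String)
make_garbage(100_000)
GC.start
after = live_count(String)

# A conservative collector may keep some of these alive if a stale
# stack slot or register still looks like a pointer to the array.
puts "before: #{before}, after GC: #{after}, difference: #{after - before}"
```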

I tried test1.rb and got the following result.

E:\>e:\ruby\bin\ruby a.rb
108
check now
853447
109
110
110
110
110
110
a.rb:23:in `join': Interrupt
        from a.rb:23

But when I tried the script attached to this mail, the array was never freed.
Probably in your environment the array sometimes wasn't freed in test1.rb either.

But I think this is not the cause of your problem, because ... ([ruby-talk:181563])

   Received query results                   89Mb
   String count                             901034
   Threshold breaker String                 (827236) started w/ 73743 ended w/ 900979
   Threshold breaker Table1                 47000 started w/ 0 ended w/ 47000

   Starting GC                              89Mb
   String count                             25041
   Done with GC                             82Mb
   Threshold breaker String                 -48704 started w/ 73743 ended w/ 25039

... the String count went down. If this memory problem were caused by one big array
(`records` in test_build_mem_usage?) that was not freed, the String count would not have gone down.
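
The reasoning above can be checked with a small sketch. The array `records` here is only a stand-in for the real query result, and `live_count` is a hypothetical helper. As long as the array is reachable, GC cannot reclaim its strings, so the String count cannot fall below the array size.

```ruby
# Count live objects of a class (each_object returns the number visited).
def live_count(klass)
  ObjectSpace.each_object(klass) { }
end

GC.start
baseline = live_count(String)

# Stand-in for a large result set that is still referenced.
records = Array.new(50_000) { |i| "row #{i}" }

GC.start
held = live_count(String)

# While `records` is reachable, none of its strings can be collected,
# so `held` is never smaller than records.size.
puts "baseline: #{baseline}, with live array: #{held}"
puts "strings still referenced by records: #{records.size}"
```

A String count that drops sharply after GC.start, as in the log above, therefore suggests that the strings were no longer held by such an array.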

Can you try a patch like this?

Index: array.c
===================================================================
RCS file: /src/ruby/array.c,v
retrieving revision 1.137.2.30
diff -u -w -b -p -r1.137.2.30 array.c
--- array.c	22 Dec 2005 07:08:51 -0000	1.137.2.30
+++ array.c	28 Feb 2006 04:12:46 -0000
@@ -2953,6 +2953,16 @@ rb_ary_flatten(ary)
     return ary;
 }
 
+static VALUE
+ary_capacity(VALUE ary)
+{
+    if (FL_TEST(ary, ELTS_SHARED)) {
+	return LONG2FIX(0);
+    }
+    else {
+	return LONG2FIX(RARRAY(ary)->aux.capa);
+    }
+}
 
 /* Arrays are ordered, integer-indexed collections of any object. 
  * Array indexing starts at 0, as in C or Java.  A negative index is 
@@ -3050,6 +3060,8 @@ Init_Array()
     rb_define_method(rb_cArray, "flatten!", rb_ary_flatten_bang, 0);
     rb_define_method(rb_cArray, "nitems", rb_ary_nitems, 0);
 
+    rb_define_method(rb_cArray, "capacity", ary_capacity, 0);
+
     id_cmp = rb_intern("<=>");
     inspect_key = rb_intern("__inspect_key__");
 }
Index: string.c
===================================================================
RCS file: /src/ruby/string.c,v
retrieving revision 1.182.2.44
diff -u -w -b -p -r1.182.2.44 string.c
--- string.c	27 Oct 2005 08:19:20 -0000	1.182.2.44
+++ string.c	28 Feb 2006 04:13:06 -0000
@@ -4613,6 +4613,16 @@ rb_str_setter(val, id, var)
     *var = val;
 }
 
+static VALUE
+str_capacity(VALUE str)
+{
+    if (FL_TEST(str, ELTS_SHARED)) {
+	return LONG2FIX(0);
+    }
+    else {
+	return LONG2FIX(RSTRING(str)->aux.capa);
+    }
+}
 
 /*
  *  A <code>String</code> object holds and manipulates an arbitrary sequence of
@@ -4730,6 +4740,8 @@ Init_String()
 
     rb_define_method(rb_cString, "sum", rb_str_sum, -1);
 
+    rb_define_method(rb_cString, "capacity", str_capacity, 0); /* for debug */
+
     rb_define_global_function("sub", rb_f_sub, -1);
     rb_define_global_function("gsub", rb_f_gsub, -1);
 

... And please embed code like this into the script from [ruby-talk:181563] in order to get
the real memory usage of String/Array. Maybe the situation will become clearer.

def format(num)
  num.to_s.gsub(/(\d{1,3})(?=\d{3}+$)/) { $1 + "," }
end

def count(type)
   count = 0
   capacity = 0
   ObjectSpace.each_object(type) do |o|
     count += 1
     capacity += o.capacity
   end
   puts "#{type}"
   puts "  count = #{format(count)}"
   puts "  capacity = #{format(capacity)}"
end

def pause
   GC.start
   count(String)
   count(Array)
   puts
   sleep 5
end

class A
  def run
    arr = []
    800000.times { arr << "d" * 7 }
    pause
  end
end

pause
A.new.run
loop { pause }
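
For an interpreter without the patch, a rough alternative is sketched below. It assumes a later Ruby that ships the objspace extension, which the 1.8 branch patched above does not, and it uses `ObjectSpace.memsize_of` in place of the `capacity` method. The helper name `usage` is hypothetical, and memsize_of is documented as only a hint, so treat the byte totals as approximate.

```ruby
require 'objspace'

# Sum the object count and approximate byte size of every live
# instance of a class. memsize_of is only a hint, not an exact figure.
def usage(klass)
  count = 0
  bytes = 0
  ObjectSpace.each_object(klass) do |o|
    count += 1
    bytes += ObjectSpace.memsize_of(o)
  end
  [count, bytes]
end

# Keep a large array alive so it shows up in the totals.
arr = Array.new(100_000) { "d" * 7 }

[String, Array].each do |klass|
  count, bytes = usage(klass)
  puts "#{klass}: count = #{count}, bytes = #{bytes}"
end
puts "held elements: #{arr.size}"
```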