I've got a table that consists of about 2.6 millions rows, each row 
containing 5 columns (a short string, 3 integers, and a decimal 
fraction).

A program needs to read the whole table. At first, programming without 
brain fully engaged, I just grabbed it in Perl using the DBI function 
selectall_arrayref, and then looped through the results.

That uses about 1.3 gig of memory. Perl data structures are not 
efficient when it comes to memory. This approach is reading the whole 
table (which is a mere 130 MB according to MySQL) into a Perl data 
structure (array of rows, each row an array of columns), which I then 
scan to build my data structures.

Recoding to not be so brain dead, instead using prepare and execute, 
cuts the memory usage to about 270 MB. That makes sense, as the table is 
now being loaded into some data structure maintained by the C library 
that implements the Perl DBI driver for mysql.

Out of curiosity, I decided to try this in other languages. Python is 
the same as Perl. 270 MB when I execute the store_result() function to 
grab the data from the database.

Same in C. 270 MB used when I grab the results with mysql_store_result().

This all makes sense, assuming the Perl and Python database drivers are 
using the C mysql library.

So then I tried Ruby:

   require "mysql"
   my = Mysql.new(...)
   st = my.prepare("SELECT * from my_table")
   st.execute
   while row = st.fetch do
      ...
   end

That only uses about 130 MB of memory!?

The above was on a 64-bit machine. On a 32-bit machine, the numbers are 
185 MB for C, Perl, and Python, and 103 MB for Ruby.

How is Ruby able to do so much better on memory usage?

I did not test speed, but I think the Ruby version also ran a little 
faster than the others.


-- 
--Tim Smith