On 02/08/2010 10:18 PM, Tim Smith wrote: > I've got a table that consists of about 2.6 millions rows, each row > containing 5 columns (a short string, 3 integers, and a decimal > fraction). > > A program needs to read the whole table. At first, programming without > brain fully engaged, I just grabbed it in Perl using the DBI function > selectall_arrayref, and then looped through the results. > > That uses about 1.3 gig of memory. Perl data structures are not > efficient when it comes to memory. This approach is reading the whole > table (which is a mere 130 MB according to MySQL) into a Perl data > structure (array of rows, each row an array of columns), which I then > scan to build my data structures. > > Recoding to not be so brain dead, instead using prepare and execute, > cuts the memory usage to about 270 MB. That makes sense, as the table is > now being loaded into some data structure maintained by the C library > that implements the Perl DBI driver for mysql. > > Out of curiosity, I decided to try this in other languages. Python is > the same as Perl. 270 MB when I execute the store_result() function to > grab the data from the database. > > Same in C. 270 MB used when I grab the results with mysql_store_result(). > > This all makes sense, assuming the Perl and Python database drivers are > using the C mysql library. > > So then I tried Ruby: > > require "mysql" > my = Mysql.new(...) > st = my.prepare("SELECT * from my_table") > st.execute > while row = st.fetch do > ... > end > > That only uses about 130 MB of memory!? > > The above was on a 64-bit machine. On a 32-bit machine, the numbers are > 185 MB for C, Perl, and Python, and 103 MB for Ruby. > > How is Ruby able to do so much better on memory usage? > > I did not test speed, but I think the Ruby version also ran a little > faster than the others. It appears that you employ two fundamentally different approaches: the description of your Perl and Python versions sounds as if you load the *whole table* in one data structure and only then iterate or process it. The Ruby code fetches a single row at a time (or at most a few rows if there is caching going on behind the scenes). Well, if my assessment is correct, you have your reason. :-) Kind regards robert -- remember.guy do |as, often| as.you_can - without end http://blog.rubybestpractices.com/