Hi all! I've just run an experiment to see how Ruby 1.8.0p2 and Ruby 1.6.8 compare to Python 2.2.2 on Mandrake Linux 9.0. For this I used a Dell workstation with 1 GB of RAM, dual Xeon CPUs and two 18 GB SCSI disks. I tested basic sequential text file I/O, string.split(";") and hashing. The test CSV file was reasonably large (around 400 MB and 3.5 million lines).

The timings I got are:

Ruby 1.6.8 on Linux
===================

Test 1 --> elapsed time to read the full file, line by line, counting the number of lines ==> 23 seconds
Test 2 --> the same as above, but now also running for each line read
                          fields = line.split(";", -1)
                          n_fields += fields.length
                ==> 57 seconds (delta=34 seconds for the extra string.split work)
Test 3 --> the same as Test 2, but now adding a hash table (table=Hash.new(0)) to count the occurrences of each distinct value of fields[5]
                          table[fields[5]] += 1
                ==> 65 seconds (delta=8 seconds for the extra hashing work)
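
For reference, the three tests described above could be sketched roughly as follows. This is only an illustration, not the actual benchmark script: the method names (count_lines, count_fields, count_values, timed) and the tiny SAMPLE data are made up here, and a StringIO stands in for the real 400 MB file so the snippet runs on its own.

```ruby
require "stringio"

# Test 1: read the input line by line and count the lines.
def count_lines(io)
  n_lines = 0
  io.each_line { n_lines += 1 }
  n_lines
end

# Test 2: as Test 1, plus split every line on ";" and total the fields.
# The -1 limit keeps trailing empty fields instead of dropping them.
def count_fields(io)
  n_fields = 0
  io.each_line do |line|
    fields = line.split(";", -1)
    n_fields += fields.length
  end
  n_fields
end

# Test 3: as Test 2, plus count how often each value of fields[5] occurs.
def count_values(io)
  n_fields = 0
  table = Hash.new(0)
  io.each_line do |line|
    fields = line.split(";", -1)
    n_fields += fields.length
    table[fields[5]] += 1
  end
  [n_fields, table]
end

# Run a block and return its result together with the elapsed wall-clock time.
def timed
  start = Time.now
  result = yield
  [result, Time.now - start]
end

# Made-up sample data (three lines); the real input was a large CSV file.
SAMPLE = "a;b;c;d;e;x;g\na;b;c;d;e;y;\na;b;c;d;e;x;g\n"

lines, secs = timed { count_lines(StringIO.new(SAMPLE)) }
puts "Test 1: #{lines} lines in #{secs} s"
```

Against a real file, the same methods could be called inside File.open(path) { |f| ... }, where path is whatever CSV file is being measured. Each test's delta is then its elapsed time minus that of the previous test.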
        
Ruby 1.8.0p2 also on Linux
==========================

Test 1 -->  7 seconds (about 3 times faster than version 1.6.8!)
Test 2 --> 83 seconds (delta=76 secs --> considerably slower than version 1.6.8!)
Test 3 --> 91 seconds (delta=8 secs --> same speed as version 1.6.8)

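First conclusion: 1.8.0p2 is much faster than version 1.6.8 for basic sequential text I/O, but string operations like str.split(";", -1) seem to be much slower: Test 2 takes roughly 45% longer overall (83 vs. 57 seconds), and the split delta itself more than doubles (76 vs. 34 seconds). Any reason for this?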

Python 2.2.2 also on Linux
==========================

Test 1 --> 8 secs (approx. the same speed as Ruby 1.8.0p2)
Test 2 --> 62 secs (delta=54 secs, a bit slower than Ruby 1.6.8 but faster than Ruby 1.8.0p2)
Test 3 --> 73 secs (delta=9 secs --> a tiny bit slower than Ruby 1.8.0p2 or 1.6.8)

So, at least for these very basic operations, Ruby compares well with Python. Any idea why Ruby 1.8.0p2 is slower than version 1.6.8 for such a basic operation as string.split(";", -1)?
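
One way to narrow this down might be to time split on its own, with the file I/O taken out of the picture. The sketch below is only a suggestion under some assumptions: time_split is a hypothetical helper, the 20-field line is made up, and Process.clock_gettime only exists in much newer Ruby versions (on 1.6.8 or 1.8.0p2 one would use Time.now instead).

```ruby
# Hypothetical helper: call split on a single in-memory line `iterations`
# times and return the total number of fields plus the elapsed seconds.
def time_split(line, iterations)
  total_fields = 0
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { total_fields += line.split(";", -1).length }
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
  [total_fields, elapsed]
end

# Made-up input: one line with 20 fields separated by ";".
line = (["field"] * 20).join(";") + "\n"

fields, secs = time_split(line, 100_000)
puts "#{fields} fields split in #{secs} s"
```

If the same script shows a similar gap between two interpreter versions, the difference would be in split itself (or in allocating the resulting strings and arrays) rather than in how lines are read from disk.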

Regards,

J. Alegria

BTW: While Python on Windows XP (on the same machine) is almost as fast as on Linux, the same doesn't seem to be true for Ruby. The Windows version is slower than the Linux one!
