On Wed, Apr 23, 2003 at 09:10:28PM +0900, JosSantos Alegria wrote:
> Thanks! I did try it and got just a 10% improvement. The 1.6.8 version is still faster.
> 
> 1. Ruby 1.6.8 under Linux   (fields=line.split(";", -1); n_fields+=fields.length) ==> 57 seconds
> 2. Ruby 1.8.0p2 under Linux (fields=line.split(";", -1); n_fields+=fields.length) ==> 83 seconds
> 3. Ruby 1.6.8 under Linux   (fields=line.split(/;/, -1); n_fields+=fields.length) ==> 59 seconds
> 4. Ruby 1.8.0p2 under Linux (fields=line.split(/;/, -1); n_fields+=fields.length) ==> 65 seconds
> 5. Python 2.2.2 under Linux (fields=line.split(";");     n_fields+=len(fields))   ==> 54 seconds 
> 
> So, your suggestion did improve performance but nevertheless 1.8.0p2 is still slower than 1.6.8 and I believe it shouldn't! Could it be the result of increased garbage collection on the 1.8.0p2 version? The file I'm processing is reasonably large: 400MB with 3.5 million lines! BTW how can I obtain the time spent by Ruby on garbage collection and the number of times it was executed?

IIRC some of the GC tuning parameters were changed in 1.8, which could give
wildly different results in programs that rely heavily on the GC (like yours).
 
To get the time spent collecting, you could build Ruby with profiling
information (for instance with gcc's -pg option) and profile the interpreter
itself (e.g. with gprof).
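
For what it's worth, here is a rough sketch of how the same question can be
answered from inside the script on newer interpreters. It assumes Ruby 1.9.3
or later, where GC.count and GC::Profiler exist; neither is available in
1.6.8 or 1.8.0, so it does not apply to the versions benchmarked above. The
method name count_fields_with_gc_stats and the sample data are made up for
illustration; only the split/length loop comes from your test.

```ruby
# Sketch only: assumes Ruby >= 1.9.3 (GC.count and GC::Profiler are not
# present in 1.6.x or 1.8.x).

# Hypothetical helper: runs the field-counting loop from the benchmark and
# reports how many GC runs happened and how long they took.
def count_fields_with_gc_stats(lines)
  GC::Profiler.enable
  GC::Profiler.clear               # discard any earlier profile data
  gc_runs_before = GC.count        # GC runs since the interpreter started

  n_fields = 0
  lines.each do |line|
    fields = line.split(";", -1)   # -1 keeps trailing empty fields
    n_fields += fields.length
  end

  gc_runs    = GC.count - gc_runs_before
  gc_seconds = GC::Profiler.total_time   # seconds spent in GC while enabled
  GC::Profiler.disable

  { :n_fields => n_fields, :gc_runs => gc_runs, :gc_seconds => gc_seconds }
end

# Made-up sample data standing in for the real 400MB file.
sample = Array.new(1000) { "a;b;;d;" }
stats  = count_fields_with_gc_stats(sample)
puts "fields=#{stats[:n_fields]} gc_runs=#{stats[:gc_runs]} " \
     "gc_time=#{stats[:gc_seconds]}s"
```

With the real file you would pass the lines read from it (File.foreach, say)
instead of the sample array, and compare gc_time against the total wall-clock
time to see how much of the run is spent collecting.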

-- 
 _           _                             
| |__   __ _| |_ ___ _ __ ___   __ _ _ __  
| '_ \ / _` | __/ __| '_ ` _ \ / _` | '_ \ 
| |_) | (_| | |_\__ \ | | | | | (_| | | | |
|_.__/ \__,_|\__|___/_| |_| |_|\__,_|_| |_|
	Running Debian GNU/Linux Sid (unstable)
batsman dot geo at yahoo dot com

How do you power off this machine?
	-- Linus, when upgrading linux.cs.helsinki.fi, and after using the machine for several months