On Friday, September 20, 2002, at 09:22 AM, Tomas Brixi wrote:

> I have written just a simple script to analyze a log file and (just 
> for fun) I have written
> exactly the same in python to see the difference and ...
> python is almost twice the faster doing the same job :-| (???)
> You can see attached files for sources.
> Environment: P4 1.8Ghz, 256MB, WinXP Pro. Python 2.2.1, Ruby 1.7.2-4 - 
> the Pragmatic distribution.
> The analyzed file is about 420 Mbytes and python does it in about 60 
> sec and ruby in about 115
> sec.
> Have some suggestion how to speed the ruby code?

Well... I don't have a WEB.log file as weird as yours so I had to fake 
it. :)

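First, I got an error (ruby 1.6.7) with your first regex. Changing it 
to /.../i fixed that, though I'm not sure why. With that fixed, running 
with -rprofile shows (columns are % time, cumulative seconds, self 
seconds, calls, self ms/call, total ms/call, name):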

  50.00     0.03      0.03        1    31.25    54.69  IO#each_line
  25.00     0.05      0.02       46     0.34     0.34  String#index
  12.50     0.05      0.01       41     0.19     0.19  String#split
  12.50     0.06      0.01        7     1.12     1.12  IO#write
  ...

There are several things you can do to speed up the ruby code, and 
probably the python code... These are really general ideas:

1) Don't index the whole line in the case of it NOT being a comment.
2) Don't split the whole line if you only want the second field.
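
A rough sketch of those two ideas, assuming (since I don't have your file) that '#' marks a comment only at the very start of a line and that fields are separated by whitespace. The helper names comment? and second_field are made up for illustration:

```ruby
# Hypothetical helpers; the field layout is an assumption, not taken
# from the real WEB.log.

# Idea 1: look only at the first character instead of searching the
# whole line for '#'. An empty line gives nil here, so it is not a comment.
def comment?(line)
  line[0] == ?#
end

# Idea 2: give split a limit so it stops once the field we want has been
# separated out. With a limit of 3, everything after the second field
# stays unsplit in the last element.
def second_field(line)
  line.split(" ", 3)[1]
end
```

For a line with only one field, second_field returns nil, so the caller would need to check for that before matching against it.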

I had to tweak my script to work on a different file, splitting on 
colon and such. But here are my changes:

outside the loop:

   comment_re = /^#/

inside the loop:

   next if line =~ comment_re

   if line =~ /[^:]+:([^:]+):/ then
     pom = $1
     if pom =~ r
       outf.write( line + "\n" )
     end
   end

This speeds up my run by more than half for an admittedly small and 
unscientific sample. The same changes should speed up the python 
version as well. For some speed comparisons between python and ruby, I 
suggest you look at the "big language shootout" at 
http://www.bagley.org/~doug/shootout/. Python is only moderately 
faster than perl or ruby, and they all have their own pros and cons 
(ruby's method dispatching blows away python and perl, for example, but 
ruby is not as fast in numerics).
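
Putting the loop changes together, here is a self-contained sketch that runs against fake data. The colon-separated layout matches my test file, not necessarily yours, and filter_log, the constant names, and the sample lines are all invented for the example:

```ruby
require "stringio"

# Both patterns are built once, outside the loop.
COMMENT_RE = /^#/                # line starts with '#'
FIELD_RE   = /[^:]+:([^:]+):/    # captures only the second colon-separated field

# Hypothetical filter: copy each non-comment line whose second field
# matches `wanted` from `input` to `output`.
def filter_log(input, output, wanted)
  input.each_line do |line|
    next if line =~ COMMENT_RE
    if line =~ FIELD_RE
      # each_line keeps the trailing newline, so write the line as-is.
      output.write(line) if $1 =~ wanted
    end
  end
  output
end

# Fake data standing in for the real log file.
sample = StringIO.new("# header\nhost1:GET:/index.html\nhost2:POST:/form\n")
result = filter_log(sample, StringIO.new, /GET/)
puts result.string
```

With a real file you would pass File objects instead of StringIO, for example inside File.open blocks for the input and output paths.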