Well... yes and no ;-)
I looked back at the code you sent on Saturday and noticed that I had just copied it without
changing the separator from : to \t, so it split nothing and ran much longer.

Now I have run your script with inline regexen and it takes about 60 secs, which is only slightly
faster than using the index method, but the code is clearer.
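
The separator mix-up is easy to reproduce. A minimal sketch, assuming a tab-separated log line (the sample line and the helper field_count are made up for illustration, not taken from the attached scripts):

```ruby
# Hypothetical tab-separated log line; the field layout is an assumption.
LINE = "10.0.0.1\t2002-09-23\tGET\t/index.html\t200"

# Hypothetical helper: how many fields does split produce for a separator?
def field_count(line, separator)
  line.split(separator).length
end

puts field_count(LINE, ":")   # separator absent: the whole line comes back as one field
puts field_count(LINE, "\t")  # the real separator: one element per field
```

With the wrong separator, every later test runs against the full line instead of a short field, which would be consistent with the longer run time.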

Tom


--- Ryan Davis <ryand / zenspider.com> wrote:
> 
> On Monday, September 23, 2002, at 12:39 AM, Tomas Brixi wrote:
> 
> > Thanks all for speedup tips.
> >
> > I have tried all of them and the fastest one is attached.
> > Results:
> > ruby : 115 sec -> 62 sec  (wow :-)
> > python : 60 sec -> 53 sec
> 
> Are you sure you included the right script? I just expanded your weblog 
> to 4.5 megs and ran the script you included and got a time of 80 
> seconds (via time). I then added the rest of the changes I suggested 
> and ran my version in 26 seconds (via time). Included below are the 
> scripts and the output. The changes were:
> 
> 1) get rid of all calls to index.
> 2) put all regexen inline.
> 
> > But in general there could be conditions on more fields. What then?
> > Use String.split to get the fields and then match single fields, or 
> > build an all-in-one regex and try to match the whole line?
> 
> That really depends. If you can order the exclusion conditions in 
> such a way that you can avoid the split, you probably want to go that 
> way. But I'd just measure and see.
> 
> 
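
One way to "measure and see" is Ruby's standard Benchmark module. This is only a sketch under assumptions: the generated lines, the five-field tab-separated layout, and the names match_by_split, match_by_regex and WHOLE_LINE are invented here and are not from the attached scripts.

```ruby
require "benchmark"

# Hypothetical data: host, date, method, path, status, separated by tabs.
LINES = Array.new(50_000) do |i|
  ext    = i.even? ? "html" : "gif"
  status = (i % 3).zero? ? 404 : 200
  "10.0.0.#{i % 255}\t2002-09-23\tGET\t/page#{i}.#{ext}\t#{status}"
end

# Variant A: split first, then test the individual fields.
def match_by_split(line)
  fields = line.split("\t")
  !(fields[3] =~ /\.html\z/).nil? && fields[4] == "200"
end

# Variant B: one regex against the whole line, no split.
WHOLE_LINE = /\A[^\t]*\t[^\t]*\t[^\t]*\t[^\t]*\.html\t200\z/

def match_by_regex(line)
  !(line =~ WHOLE_LINE).nil?
end

# Print timings for both variants over the same input.
Benchmark.bm(12) do |bm|
  bm.report("split:")       { LINES.count { |l| match_by_split(l) } }
  bm.report("whole regex:") { LINES.count { |l| match_by_regex(l) } }
end
```

Which variant wins depends on the data and on the Ruby version, so the timings should be taken from your own log rather than from this sample.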

> ATTACHMENT part 2 application/octet-stream x-unix-mode=0644; name=parse.rb.orig


> ATTACHMENT part 3 application/octet-stream x-unix-mode=0664; name=parse.rb.time


> ATTACHMENT part 4 application/octet-stream x-unix-mode=0644; name=parse.rb


> ATTACHMENT part 5 application/octet-stream x-unix-mode=0664; name=parse.rb.orig.time


