On Sunday, October 6, 2002, at 08:45 PM, David Garamond wrote:

>  $ perl -0 -ne'print 1 if /A.+\n+B/s' str   # 0m1.682s
>  $ ruby -0 -ne'print 1 if /A.+\n+B/m' str   # 0m12.468s
>  $ perl -0 -ne'print 1 if /A.+\n.+\nB/s' str   # 0m1.368s
>  $ ruby -0 -ne'print 1 if /A.+\n.+\nB/m' str   # 0m11.427s
> ps: using ruby 1.6.7 vs perl 5.8.0 on linux

So, I'm using ruby 1.6.7 and perl 5.6.0 on Mac OS X and I found 
something strange. Perl flies until you add parentheses. Then they 
roughly tie in performance.

	<531> time perl -0 -ne 'print 1 if /A.+\n+B/s' str1     # real 0m01.568s
	<532> time perl -0 -ne 'print 1 if /A(.+)(\n+)B/s' str  # real 0m10.763s
	<534> time ruby -0 -ne 'print 1 if /A.+\n+B/m' str      # real 0m07.856s
	<535> time ruby -0 -ne 'print 1 if /A(.+)(\n+)B/m' str  # real 0m10.532s

I added the parentheses to check that both were matching the same 
text and not performing different amounts of backtracking and whatnot 
(see below). They are indeed the same. It looks like perl has some 
extra optimizations in place for patterns with no parentheses.
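
For anyone who wants to repeat that check without the command line, here 
is a rough sketch in Ruby. The contents of the str file are not shown in 
this thread, so the sample string below is only a guess at its layout, 
and the variable names are mine.

```ruby
# Hypothetical stand-in for the `str` file; the real contents are not
# shown in the thread, so this layout is an assumption.
sample = "Axx" + "\n" * 10_000 + "B"

# Same greedy pattern, without and with capture groups.
plain   = /A.+\n+B/m.match(sample)
grouped = /A(.+)(\n+)B/m.match(sample)

# Both forms should cover exactly the same span of the input.
same_span = plain.begin(0) == grouped.begin(0) &&
            plain.end(0) == grouped.end(0)

# The capture lengths show how the greedy match divided the text.
greedy_lengths = [grouped[1].length, grouped[2].length]

puts "same span: #{same_span}"
puts greedy_lengths.join(", ")
```

On this assumed input the greedy .+ swallows all but the last newline, 
so the two lengths come out as 10001 and 1.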

Many times in regexen like these, you don't need or want greedy 
matches. Non-greedy is much, much faster in many cases:

	<537> time ruby -0 -ne 'puts "#{$1.length}, #{$2.length}" if /A(.+?)(\n+?)B/m' str
	2, 10000
	
	real    0m0.119s
	user    0m0.050s
	sys     0m0.020s
	<538> time perl -0 -ne 'if (/A(.+?)(\n+?)B/s) { $a = length($1); $b = length($2); print "$a, $b\n"}' str
	2, 10000
	
	real    0m0.020s
	user    0m0.000s
	sys     0m0.010s

Perl still beats us, but at .1 seconds, who cares? :)
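
If you want to compare the greedy and non-greedy forms side by side in 
one script, something like the following sketch might do. The sample 
string is again an assumed layout rather than the real str file, and 
timed is a hypothetical helper, not a library method. Timings will vary 
a lot by machine and regex engine version, so no numbers are claimed here.

```ruby
# Hypothetical input; the actual `str` file is not shown in the thread.
sample = "Axx" + "\n" * 10_000 + "B"

# Hypothetical helper: runs the block and returns [result, elapsed seconds].
def timed
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  [result, Process.clock_gettime(Process::CLOCK_MONOTONIC) - start]
end

# Greedy: .+ grabs as much as possible, then backtracks.
greedy, greedy_secs = timed { /A(.+)(\n+)B/m.match(sample) }

# Non-greedy: .+? and \n+? grow one character at a time.
lazy, lazy_secs = timed { /A(.+?)(\n+?)B/m.match(sample) }

greedy_lengths = [greedy[1].length, greedy[2].length]
lazy_lengths   = [lazy[1].length, lazy[2].length]

printf("greedy:     %d, %d (%.6fs)\n", *greedy_lengths, greedy_secs)
printf("non-greedy: %d, %d (%.6fs)\n", *lazy_lengths, lazy_secs)
```

Both patterns match the same overall text here, but they split it 
differently between the two groups, which is worth keeping in mind if 
you rely on $1 and $2.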