Jan E. wrote in post #1051180:
> The part ".*?" of the regular expression is very inefficient

Is it? Have you measured it?

>, because it
> will at first consume every character until the end of the line and then
> try to find the minimum of characters needed.

Does it? There are many implementations of Ruby; which particular one(s)
are you referring to?

Your argument suggests that

    /(.+?)(.+?)(.+?)(.+?)(.+?)/ =~ "a"*1_000_000

would be extremely inefficient, but actually it runs very fast for me.
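
If you want to check that on your own interpreter, here is a minimal sketch. The variable names are mine, and the timing will depend on which Ruby implementation and version you run, so I am not quoting a number:

```ruby
require 'benchmark'

# Hypothetical names; the string size is taken from the example above.
long_str = "a" * 1_000_000
match = nil

# Time a single match of five lazy groups against the long string.
elapsed = Benchmark.realtime do
  match = /(.+?)(.+?)(.+?)(.+?)(.+?)/.match(long_str)
end

# Each lazy group should have captured exactly one character.
p match.captures
puts "matched in #{elapsed} seconds"
```

Each group captures a single "a", so the match covers only the first five characters of the string.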

So let's measure whether you are right or wrong:

require 'benchmark'

LONGSTR = ("a" * 1_000_000).freeze

Benchmark.bmbm do |x|
  x.report("chars") { 1_000_000.times { /aaaaa/ =~ LONGSTR } }
  x.report("non-greedy") { 1_000_000.times { /.*?.*?.*?.*?.*?/ =~ LONGSTR } }
end

And the results for me, using Ruby 1.8.7 under Mac OS X Lion on a MacBook
Air i7:

Rehearsal ----------------------------------------------
chars        0.520000   0.000000   0.520000 (  0.516861)
non-greedy   0.510000   0.000000   0.510000 (  0.511089)
------------------------------------- total: 1.030000sec

                 user     system      total        real
chars        0.510000   0.000000   0.510000 (  0.505664)
non-greedy   0.510000   0.000000   0.510000 (  0.511662)

I see no difference there.

> Also you don't need the block version of gsub. You can simple use a
> substitute string and refer to the parenthesized subexpression by \1:

You can, but the block version is often clearer, especially if you are 
doing things like backslash-escaping strings:

# clear
a.gsub(/(.)/) { "\\#{$1}" }

# same result but horrible
a.gsub(/(.)/, "\\\\\\1")
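
To confirm that the two forms really do produce the same string, here is a small sketch. The sample input and variable names are just for illustration:

```ruby
# Hypothetical sample input; any string would do.
a = 'a.b'

# Block form: the block's return value is inserted literally,
# so one escaped backslash in the source is one backslash in the output.
block_form = a.gsub(/(.)/) { "\\#{$1}" }

# String form: the replacement string is itself scanned for backslash
# sequences, so a literal backslash needs "\\\\" in the source, followed
# by "\\1" for the first capture group.
string_form = a.gsub(/(.)/, "\\\\\\1")

puts block_form
puts string_form
puts block_form == string_form
```

Both calls put a backslash in front of every character. The string form only needs six backslashes because the string literal and the gsub replacement syntax each do their own round of unescaping.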

-- 
Posted via http://www.ruby-forum.com/.