On Sunday 12 February 2006 19:30, James Edward Gray II wrote:
> On Feb 12, 2006, at 12:08 PM, Marcin Mielżyński wrote:
> > James Edward Gray II wrote:
> >>  >> "<lyrics artist=XXX album=XXX title=XXX> Lalalalala </lyrics>".sub(/<(\w+)[^>]+>/, "<\\1>")
> >> => "<lyrics> Lalalalala </lyrics>"
> >
> > reluctant would be a bit faster:
> >
> > p "<lyrics artist=XXX album=XXX title=XXX> Lalalalala </lyrics>".gsub(/<(\w+).*?>/, "<\\1>")
>
> Are you sure?
>
> $ ruby regexp_time.rb
> Rehearsal -------------------------------------------------
> /<(w+)[^>]+>/   7.210000   0.030000   7.240000 (  7.266166)
> /<(w+).*?>/     7.710000   0.020000   7.730000 (  7.757304)
> --------------------------------------- total: 14.970000sec
>
>                      user     system      total        real
> /<(w+)[^>]+>/   7.170000   0.030000   7.200000 (  7.227075)
> /<(w+).*?>/     7.730000   0.020000   7.750000 (  7.777196)
> $ cat regexp_time.rb
> #!/usr/local/bin/ruby -w
>
> require "benchmark"
>
> tests = 1000000
> data  = "<lyrics artist=XXX album=XXX title=XXX> Lalalalala </lyrics>"
>
> Benchmark.bmbm do |x|
>    x.report("/<(\w+)[^>]+>/") do
>      tests.times { data.sub(/<(\w+)[^>]+>/, "<\\1>") }
>    end
>    x.report("/<(\w+).*?>/") do
>      tests.times { data.sub(/<(\w+).*?>/, "<\\1>") }
>    end
> end
>
> __END__
>
> ;)
>
> James Edward Gray II

The non-greedy match has to "back up" and retry the rest of the pattern on
every character after the tag name, whereas James' [^>] never has to back
up. In fact, even a greedy .* would probably be faster than a non-greedy one
in this case (though it would run on to the last > in the string, so the
result would not be the same).
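
For anyone who wants to check this, here is a minimal sketch that reuses the
sample string from the thread. The greedy pattern, the variable names and the
iteration count are illustrative additions, not something from the earlier
messages, and the timings will vary by machine and Ruby version. It prints
what each pattern produces before timing them.

```ruby
require "benchmark"

data = "<lyrics artist=XXX album=XXX title=XXX> Lalalalala </lyrics>"

# The two patterns from the thread, plus the greedy variant speculated about.
negated = /<(\w+)[^>]+>/   # negated character class: stops at the first ">"
lazy    = /<(\w+).*?>/     # non-greedy dot: tries ">" after every character
greedy  = /<(\w+).*>/      # greedy dot: runs to the end, backs up to last ">"

# Compare what each pattern actually replaces before comparing speed.
results = {
  negated: data.sub(negated, "<\\1>"),
  lazy:    data.sub(lazy,    "<\\1>"),
  greedy:  data.sub(greedy,  "<\\1>")
}
results.each { |name, out| puts "#{name}: #{out.inspect}" }

# Assumed iteration count, kept small so the script finishes quickly.
n = 50_000
Benchmark.bm(8) do |x|
  x.report("negated") { n.times { data.sub(negated, "<\\1>") } }
  x.report("lazy")    { n.times { data.sub(lazy,    "<\\1>") } }
  x.report("greedy")  { n.times { data.sub(greedy,  "<\\1>") } }
end
```

The first two patterns give the same output on this input, while the greedy
one swallows everything up to the final ">", so it is only a fair comparison
for speed, not a drop-in replacement.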

Gotta love the black art that is optimizing regexps.

David Vallner