On Sunday 12 February 2006 19:30, James Edward Gray II wrote:
> On Feb 12, 2006, at 12:08 PM, Marcin Mielżyński wrote:
> > James Edward Gray II wrote:
> >> "<lyrics artist=XXX album=XXX title=XXX> Lalalalala </lyrics>".sub(/<(\w+)[^>]+>/, "<\\1>")
> >> => "<lyrics> Lalalalala </lyrics>"
> >
> > reluctant would be a bit faster:
> >
> > p "<lyrics artist=XXX album=XXX title=XXX> Lalalalala </lyrics>".gsub(/<(\w+).*?>/, "<\\1>")
>
> Are you sure?
>
> $ ruby regexp_time.rb
> Rehearsal -------------------------------------------------
> /<(w+)[^>]+>/   7.210000   0.030000   7.240000 (  7.266166)
> /<(w+).*?>/     7.710000   0.020000   7.730000 (  7.757304)
> --------------------------------------- total: 14.970000sec
>
>                     user     system      total        real
> /<(w+)[^>]+>/   7.170000   0.030000   7.200000 (  7.227075)
> /<(w+).*?>/     7.730000   0.020000   7.750000 (  7.777196)
> $ cat regexp_time.rb
> #!/usr/local/bin/ruby -w
>
> require "benchmark"
>
> tests = 1000000
> data = "<lyrics artist=XXX album=XXX title=XXX> Lalalalala </lyrics>"
>
> Benchmark.bmbm do |x|
>   x.report("/<(\w+)[^>]+>/") do
>     tests.times { data.sub(/<(\w+)[^>]+>/, "<\\1>") }
>   end
>   x.report("/<(\w+).*?>/") do
>     tests.times { data.sub(/<(\w+).*?>/, "<\\1>") }
>   end
> end
>
> __END__
>
> ;)
>
> James Edward Gray II

The non-greedy match has to "back up" and retry on every character after
the tag name, whereas James' [^>]+ never has to back up. In fact, even a
greedy .* would probably be faster than a non-greedy one in this case.

Gotta love the black art that is optimizing regexps.

David Vallner
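
For anyone who wants to check the guess about greedy .* for themselves, here is a rough sketch that runs all three variants from this thread side by side. The hash labels, the method names (results, time_patterns) and the iteration count are made up for illustration and are not from James' script. It assumes Ruby 2.4 or later for Hash#transform_values. Note that the greedy pattern is not a drop-in replacement: on this input it runs to the last ">" in the string and so swallows the lyrics and the closing tag as well.

```ruby
require "benchmark"

DATA = "<lyrics artist=XXX album=XXX title=XXX> Lalalalala </lyrics>"

# Hypothetical labels for the three variants discussed in the thread.
PATTERNS = {
  "negated class [^>]+" => /<(\w+)[^>]+>/,
  "reluctant .*?"       => /<(\w+).*?>/,
  "greedy .*"           => /<(\w+).*>/,
}

# What each pattern actually produces, so differences in behaviour
# are visible before any timings are compared.
def results(data = DATA)
  PATTERNS.transform_values { |re| data.sub(re, "<\\1>") }
end

# Wall-clock seconds per pattern for the given number of substitutions.
# The default iteration count is an arbitrary choice for a quick run.
def time_patterns(iterations = 100_000, data = DATA)
  PATTERNS.transform_values do |re|
    Benchmark.realtime { iterations.times { data.sub(re, "<\\1>") } }
  end
end

if __FILE__ == $PROGRAM_NAME
  results.each { |label, out| puts "#{label}: #{out.inspect}" }
  time_patterns.each { |label, secs| puts format("%-22s %.3fs", label, secs) }
end
```

Timings will depend heavily on the Ruby version and regexp engine in use, so treat any numbers from this as indicative only and rerun on your own interpreter before drawing conclusions.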