Hi,

Does anyone know of a fast implementation of the XML escape method  
(the one that converts '"<>& to &quot; etc.)?

I did some benchmarking on one of my applications and the  
implementation I have, which I thought was okay -- simple minded for  
sure, but okay -- turns out to be a bottleneck in certain operations.

I used ruby-prof with a simple test, running over a 400 character  
string 50,000 times or so. Running the profiler on version0 (below)  
took 1.39 seconds.

def version0(input)
   # all kinds of other processing of input simulated by the input.dup
   result = input.dup

   return result
end
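
For anyone who wants to reproduce the measurement without ruby-prof, here is a minimal sketch using the standard Benchmark module instead. The SAMPLE string and the time_calls helper are made up for illustration (the real 400 character input isn't shown here), and the numbers will not match the ruby-prof figures since the profiler adds its own overhead.

```ruby
require 'benchmark'

# Hypothetical 400-character sample; the real input is not shown in this post.
SAMPLE = (%q{<a href="x">Tom & Jerry's</a> } * 14)[0, 400]

# Same baseline as version0 above: only the copy, no escaping.
def version0(input)
  result = input.dup
  return result
end

# Hypothetical helper: run the given block `iterations` times against
# SAMPLE and return the elapsed wall-clock time in seconds.
def time_calls(iterations)
  Benchmark.realtime { iterations.times { yield SAMPLE } }
end

elapsed = time_calls(50_000) { |s| version0(s) }
puts format('version0: %.2f s', elapsed)
```

Swapping version0 for another version in the block gives a comparable figure for that version.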

The original simple-minded way ran in 8.74 seconds under ruby-prof:

def version1(input)
   # all kinds of other processing of input simulated by the input.dup
   result = input.dup

   result.gsub!("&", "&amp;")
   result.gsub!("<", "&lt;")
   result.gsub!(">", "&gt;")
   result.gsub!("'", "&apos;")
   result.gsub!("\"", "&quot;")

   return result
end

The best I've come up with so far ran in 3.33 seconds under ruby-prof:

def version2(input)
   # all kinds of other processing of input simulated by the input.dup
   result = input.dup

   result.gsub!(/[&<>'"]/) do |match|
     # The block's value becomes the replacement; an explicit `return`
     # here would exit version2 on the first match.
     case match
     when '&' then '&amp;'
     when '<' then '&lt;'
     when '>' then '&gt;'
     when "'" then '&apos;'
     when '"' then '&quot;'
     end
   end

   return result
end

After accounting for overhead, that is (8.74 - 1.39) / (3.33 - 1.39), or
about 3.8 times faster, which is good, but I'd like it faster still. BTW,
gsub is only marginally slower than gsub!. I've tried using simple
iteration, gsub with a hash to avoid the case, and variations; all were
slower to a lot slower than version1, and nothing was really near
version2 (which really was the first variation I tried).
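
For reference, a minimal sketch of what the "gsub with a hash" variation can look like. The names XML_ESCAPES and escape_with_hash are made up for illustration, and this is not necessarily the exact code that was timed above; no speed claim is attached to it.

```ruby
# Lookup table for the five characters; built once and frozen.
XML_ESCAPES = {
  '&' => '&amp;',
  '<' => '&lt;',
  '>' => '&gt;',
  "'" => '&apos;',
  '"' => '&quot;'
}.freeze

def escape_with_hash(input)
  # The block's value is the replacement for each matched character.
  # On Ruby 1.9 and later, input.gsub(/[&<>'"]/, XML_ESCAPES) does the
  # same lookup without a block.
  input.gsub(/[&<>'"]/) { |match| XML_ESCAPES[match] }
end

puts escape_with_hash(%q{<a href="x">Tom & Jerry's</a>})
```

The ampersand is handled in the same single pass as the other characters, so an already-produced entity such as &lt; is never escaped a second time.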

Any ideas?

Cheers,
Bob


----
Bob Hutchison                  -- tumblelog at http://www.recursive.ca/so/
Recursive Design Inc.          -- weblog at http://www.recursive.ca/hutch
http://www.recursive.ca/       -- works on http://www.raconteur.info/cms-for-static-content/home/