On 21-Nov-06, at 5:27 PM, Wes Gamble wrote:

> Has anyone done a head to head comparison of Hpricot and Rubyful Soup
> (both HTML parsers)?
>
> If so, would you be willing to comment on which one a) is faster for an
> average sized HTML page and b) preserves the original HTML better.

I switched from Rubyful Soup to Hpricot a while ago. The reason was performance on 1000-2000 character HTML chunks. I didn't run a benchmark because there was no need to: Hpricot is *a lot* faster.

I have no idea which preserves HTML better. I only use them to find specific bits of the HTML (e.g. links, images, a few other things). I do not use either to transform the input HTML; I *always* keep the input as it was.

In all cases I have the HTML in a string that I give to the parser. I do know that with Rubyful Soup it was absolutely necessary to dup the string first, or you were liable to have changes made to the input string.

Cheers,
Bob

>
> Thanks,
> Wes
>
> --
> Posted via http://www.ruby-forum.com/.
>

----
Bob Hutchison -- blogs at <http://www.recursive.ca/hutch/>
Recursive Design Inc. -- <http://www.recursive.ca/>
Raconteur -- <http://www.raconteur.info/>
xampl for Ruby -- <http://rubyforge.org/projects/xampl/>
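
For anyone wondering what the dup precaution mentioned above looks like in practice, here is a minimal sketch in plain Ruby. Note that `destructive_parse` is a hypothetical stand-in for any parser that edits its argument in place; it is not Rubyful Soup's (or Hpricot's) actual API, and the sample HTML is made up for illustration.

```ruby
# Hypothetical stand-in for a parser that modifies the string it is given.
# This is NOT a real library call; it only mimics the side effect.
def destructive_parse(html)
  html.gsub!(/\s+/, " ")               # in-place cleanup mutates the caller's string
  html.scan(/href="([^"]*)"/).flatten  # pull out the link targets
end

original = %(<a href="/one">1</a>\n\n<a   href="/two">2</a>)
pristine = original.dup                # reference copy, kept only for comparison

# Hand the parser a copy so the input string stays exactly as it was.
links = destructive_parse(original.dup)

puts links.inspect
puts "input untouched: #{original == pristine}"
```

Passing `original` directly instead of `original.dup` would leave the caller holding the whitespace-collapsed string, which is the kind of surprise the dup avoids.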