On Wed, Mar 18, 2009 at 03:08:39AM +0900, _why wrote:
> Please enjoy a succulent, new Hpricot. A bit faster, some Ruby 1.9
> support, and assorted fixes.
> 
>   gem install hpricot --source http://code.whytheluckystiff.net
> 
> It should show up at Rubyforge in a bit.
> 
> I'm sure you're wondering what's the reason for Hpricot updates, in
> the face of heated competition from the Nokogiri and LibXML
> libraries. Remember that Hpricot has no dependencies and is smaller
> than either of those libs. Hpricot uses its own Ragel-based
> parser, so you have the freedom to hack the parser itself, the code
> is dwarven by comparison.
> 
> Best of all, Hpricot has run on JRuby in the past. And I am in the
> process of merging some IronRuby code[1] and porting 0.7 to
> JRuby. This means your code will run on a variety of Ruby platforms
> without alteration. That alone makes it worthwhile, wouldn't you
> agree?
> 
> Clearly, the benchmarks you see on Ruby Inside are skewed to favor
> Nokogiri. They parse XML through Hpricot without using Hpricot.XML(),
> which is not only wrong, but puts XML through needless HTML cleanup
> operations. I am sure that Hpricot 0.7 still fares slower on large
> documents. However, for instance, try testing a large number of
> small documents (a much more common scenario) with this latest
> version.

Thank you for pointing out my mistakes.  The repository[1] is public in
order to keep myself honest.  Patches are welcome.

> You have to question a benchmark that is entirely based on two XML
> documents. What about HTML fix ups? What about various platforms
> and CPUs? Why not treat Hpricot fairly and use it properly in the
> benchmarks? It reeks of something.

HTML fix ups will be tested as well.  So will CSS searches, XPath
searches, memory usage, and many other things.  As I said[2], these benchmarks
are not complete.  If you're worried about being treated fairly, fork my
repository and write tests.
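
For anyone who wants to try the many-small-documents case, here is a
minimal sketch of what such a test might look like.  It uses only the
stdlib Benchmark module.  The time_parser helper, the sample document,
and the stand-in regexp "parser" are hypothetical and are not taken from
xml_truth.  With Hpricot installed, the callable to pass for XML input
would presumably be ->(s) { Hpricot.XML(s) } rather than plain Hpricot(s).

```ruby
require 'benchmark'

# Hypothetical helper: run `parser` (any callable taking a String) over
# every document `rounds` times and return elapsed wall-clock seconds.
def time_parser(parser, docs, rounds = 10)
  Benchmark.realtime do
    rounds.times { docs.each { |doc| parser.call(doc) } }
  end
end

# Many small documents, the scenario raised in the quoted message.
SMALL_DOC = '<root><item id="1">a</item><item id="2">b</item></root>'
docs = Array.new(1_000) { SMALL_DOC }

# Stand-in parser so the sketch runs without any gems; it only collects
# opening tag names. With Hpricot installed, swap in:
#   ->(s) { Hpricot.XML(s) }
stand_in = ->(s) { s.scan(/<(\w+)/).flatten }

elapsed = time_parser(stand_in, docs)
puts format('%d documents x 10 rounds: %.4fs', docs.size, elapsed)
```

Passing the parser as a callable keeps the timing loop identical across
libraries, so only the parser changes between runs.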

[1] https://github.com/tenderlove/xml_truth/tree
[2] http://www.rubyinside.com/ruby-xml-performance-benchmarks-1641.html#comment-38293

--
Aaron Patterson
http://tenderlovemaking.com/