For HTML scraping I recommend scrAPI.
gem install scrapi
Homepage: http://blog.labnotes.org/category/scrapi/
Example scraper:
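  require 'scrapi'

  Scraper.define do
    attr_accessor :title, :author, :pub_date, :content

    # Headline: take the text of the article's <h1>
    process "div#GuardianArticle > h1", :title => :text
    # Byline: first child node is the author, third is the date
    process "div#GuardianArticle > font[size=2] > b" do |element|
      @author = element.children[0].content
      @pub_date = element.children[2].content.strip
    end
    # Body: take the text of the article body container
    process "div#GuardianArticleBody", :content => :text
  end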
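In case the children[0] / children[2] indexing looks odd: it relies on the byline element holding a text node, some separator element, and then another text node. Here is a rough sketch of that idea using REXML rather than scrAPI, so it runs without the gem. The byline markup, the name and the date are made up for illustration (I'm assuming a <br/> sits between author and date; the real page may differ), and REXML's node classes are not the ones scrAPI hands to the block.

```ruby
require 'rexml/document'

# Hypothetical byline markup, assumed for illustration only.
byline = '<b>Jane Doe<br/> Monday June 4, 2007 </b>'

element = REXML::Document.new(byline).root

# The <b> element has three children: text, <br/>, text.
nodes = element.children

author   = nodes[0].to_s        # first text node
pub_date = nodes[2].to_s.strip  # second text node, whitespace trimmed

puts author
puts pub_date
```

If the page puts something else between the two text nodes, or wraps them in extra tags, the indexes shift, so it's worth dumping element.children once to check before trusting them.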
--
Posted via http://www.ruby-forum.com/.