Phlip wrote:
> SpringFlowers AutumnMoon wrote:
>> Would a good HTML parser be Hpricot?
>
> It's extremely good; try it and see!
>
>> I wonder if anyone knows an easy
>> way for it to get all text of an HTML file? (removing all formatting
>> tags).
>
> .each_element( './/text()' ){}.join() might do it.

Does anyone know where to go from here?

    require 'hpricot'
    doc = Hpricot("<b>hello <i>world</i></b>")

What can I do to get "hello world"?

The page at
http://code.whytheluckystiff.net/hpricot/wiki/HpricotChallenge#StripallHTMLtags
says to just use

    str = doc.to_s
    print str.gsub(/<\/?[^>]*>/, "")

but can't the < > be nested in some HTML code? If they are nested, then
the above won't work, it seems.

--
Posted via http://www.ruby-forum.com/.
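On the nesting worry: tags themselves do not nest inside one another, but a literal > can legally appear inside a quoted attribute value, and that is enough to trip the gsub above. Here is a minimal sketch in plain Ruby (no Hpricot needed) that shows the failure. QUOTE_AWARE_TAG and strip_tags are hypothetical names made up for this illustration, not part of Hpricot, and the quote-aware pattern is only an assumption about how one might patch the regex by hand.

```ruby
# The pattern quoted in the post from the HpricotChallenge page.
NAIVE_TAG = /<\/?[^>]*>/

# Hypothetical quote-aware variant (an assumption, not an Hpricot API):
# it skips over "..." and '...' attribute values so that a > inside
# quotes does not end the tag early.
QUOTE_AWARE_TAG = /<\/?[^>"']*(?:(?:"[^"]*"|'[^']*')[^>"']*)*>/

# Remove everything the given pattern recognises as a tag.
def strip_tags(html, pattern)
  html.gsub(pattern, "")
end

simple = "<b>hello <i>world</i></b>"
tricky = '<a title="a > b">link</a>'

puts strip_tags(simple, NAIVE_TAG)        # hello world
puts strip_tags(tricky, NAIVE_TAG)        #  b">link   (tag cut at the quoted >)
puts strip_tags(tricky, QUOTE_AWARE_TAG)  # link
```

Even the patched pattern is still just a regex: comments, script bodies and malformed markup can fool it, so pulling the text nodes out of the parsed document (along the lines of the text() suggestion quoted above) is likely the more robust route.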