Sam Kong wrote:
> Hi, all!
> 
> Quite often, when I need to read a list of web pages, I download the
> html sources and save them in a single file like a.html.
> If they are mostly texts, I open the html using web browser, select all
> and copy it to an editor and save it.
> I want to make the process shorter.
> How can I extract the text from html source?
> I'm sure there're many parsers for it.
> What is the most convenient one?


Take a look at Michael Neumann's WWW::Mechanize

http://www.ntecs.de/blog/Blog/WWW-Mechanize.rdoc
http://rubyforge.org/frs/?group_id=427&release_id=2014

Or install the gem
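
If the pages really are mostly text, a rough tag-stripping pass may already be enough. The following is only a minimal sketch using the Ruby standard library; it is not WWW::Mechanize and not a real HTML parser. The helper name html_to_text is made up for illustration, and the regular expressions assume reasonably well-formed markup, so badly broken HTML will likely confuse it.

```ruby
require 'cgi'

# Hypothetical helper (not part of WWW::Mechanize): a crude,
# regex-based conversion of HTML source to plain text.
def html_to_text(html)
  text = html.dup

  # Drop script and style blocks entirely, then HTML comments.
  text.gsub!(/<(script|style)\b[^>]*>.*?<\/\1>/mi, '')
  text.gsub!(/<!--.*?-->/m, '')

  # Treat <br> and common block-level closing tags as line breaks.
  text.gsub!(/<\s*(br|\/p|\/div|\/h[1-6]|\/li|\/tr)\b[^>]*>/i, "\n")

  # Remove every remaining tag.
  text.gsub!(/<[^>]+>/, '')

  # Decode basic entities such as &amp; and &lt;.
  text = CGI.unescapeHTML(text)

  # Collapse runs of spaces and blank lines.
  text.gsub!(/[ \t]+/, ' ')
  text.gsub!(/ ?\n ?/, "\n")
  text.gsub!(/\n{2,}/, "\n")
  text.strip
end
```

Assuming the sources were saved to a.html as described, something like puts html_to_text(File.read("a.html")) would print the extracted text. For anything beyond simple pages a real parser is the safer choice.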


James

> 
> Thanks.
> Sam


-- 

http://www.ruby-doc.org
http://www.rubyxml.com
http://catapult.rubyforge.com
http://orbjson.rubyforge.com
http://ooo4r.rubyforge.com
http://www.jamesbritt.com