James Britt wrote:
> Sam Kong wrote:
> > Hi, all!
> >
> > Quite often, when I need to read a list of web pages, I download the
> > html sources and save them in a single file like a.html.
> > If they are mostly texts, I open the html using web browser, select all
> > and copy it to an editor and save it.
> > I want to make the process shorter.
> > How can I extract the text from the HTML source?
> > I'm sure there are many parsers for it.
> > What is the most convenient one?
>
>
> Take a look at Michael Neumann's WWW::Mechanize
>
> http://www.ntecs.de/blog/Blog/WWW-Mechanize.rdoc
> http://rubyforge.org/frs/?group_id=427&release_id=2014
>
> Or install the gem

Thanks, James.
That looks cool.
However, it doesn't seem to have a function to extract text from HTML.
(Or did I miss it?)
What I want is something like this:

<table><tr><td>TEST</td></tr></table> => TEST

Is there a module that does this?
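
To make the request concrete, here is a minimal sketch of the conversion I mean, assuming nothing beyond Ruby's standard library. The name html_to_text is a hypothetical helper of my own, not something WWW::Mechanize provides, and regex-based tag stripping is a crude stand-in for a real parser: it will likely mishandle malformed markup or a ">" inside an attribute value.

```ruby
require 'cgi'

# Hypothetical helper (an assumption, not part of any library):
# crude HTML-to-text conversion using regular expressions only.
def html_to_text(html)
  # Drop <script> and <style> blocks together with their contents.
  text = html.gsub(/<(script|style)\b.*?<\/\1>/mi, ' ')
  # Drop HTML comments.
  text = text.gsub(/<!--.*?-->/m, ' ')
  # Replace every remaining tag with a space so adjacent cells stay separated.
  text = text.gsub(/<[^>]*>/, ' ')
  # Decode the basic entities such as &amp;, &lt;, &gt; and &quot;.
  text = CGI.unescapeHTML(text)
  # Collapse runs of whitespace and trim both ends.
  text.gsub(/\s+/, ' ').strip
end

puts html_to_text('<table><tr><td>TEST</td></tr></table>')
```

Replacing tags with a space rather than an empty string keeps the text of neighbouring table cells from running together; the final whitespace collapse cleans up the extra spaces this leaves behind.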

Regards,
Sam

>
>
> James
>
> >
> > Thanks.
> > Sam
>
>
> --
>
> http://www.ruby-doc.org
> http://www.rubyxml.com
> http://catapult.rubyforge.com
> http://orbjson.rubyforge.com
> http://ooo4r.rubyforge.com
> http://www.jamesbritt.com