Thanks to all who responded.

It looks like I could write some tiny Perl scripts with Mechanize and pipe their output to a Ruby program, so that I could do most of the work in Ruby instead of Perl. Perl reminds me of work, and that is a bad thing. Writing Perl also doesn't teach me any Ruby, which is an important part of this hack.

I also just found a mention in the Pickaxe book of the open-uri library, which will let me grab lines from a URL. This, I gather, gives me something roughly equivalent to piping the output of curl into a Ruby program.

Thanks for the help, guys. I think I'm armed with enough to be dangerous now.

jp

Ryan Leavengood wrote:
> On 3/26/06, Jeff Pritchard <jp / jeffpritchard.com> wrote:
>>
>> I was wondering if anyone could point me to some example code that is
>> using RubyfulSoup to parse a sitemap to get links to all the pages on
>> that site and request each page and grab things from it.
>
> WWW::Mechanize makes this easy. The HTML parsing has been pretty
> robust in my experience. So far I've used it to scrape my library's
> web site to see when books are due and automatically renew them, as
> well as log into Cingular.com and get my mobile phone minutes. The
> library web-site has weird redirects and some other things that
> Mechanize handles great, and the Cingular has a weird multi-step login
> system that I got going as well without too much trouble.
>
> When I needed support for check boxes in the form on the library
> web-site, the author of WWW::Mechanize, Michael Neumann, added them in
> less than 24 hours.
>
> So anyhow, this is a slick library, and very useful.
>
> Ryan

--
Posted via http://www.ruby-forum.com/.
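
P.S. For anyone finding this thread later, here is a rough sketch of the open-uri idea mentioned at the top: read a page line by line, much as you would read curl's output from a pipe, and pull out the links. Treat it as an illustration under assumptions, not tested scraping code. The names `links_from` and `fetch_links` are made up for this example, the sitemap URL is a placeholder, and the regex is only a crude stand-in for a real HTML parser such as RubyfulSoup or Mechanize. On current Ruby the call is `URI.open`; older versions let you call plain `open(url)` after requiring open-uri.

```ruby
require "open-uri"

# Collect href values from any IO-like object, reading one line at a time.
# The regex is a crude stand-in for a real HTML parser and only handles
# double-quoted href attributes.
def links_from(io)
  io.each_line.flat_map { |line| line.scan(/href="([^"]+)"/i).flatten }
end

# Hypothetical helper: fetch a page with open-uri and return its links.
# URI.open hands the block an IO-like object, so links_from can read it
# the same way it would read a file or a pipe.
def fetch_links(url)
  URI.open(url) { |page| links_from(page) }
end

# Example call (needs network access; the URL is a placeholder):
#   fetch_links("http://example.com/sitemap.html").each { |link| puts link }
```

Since `links_from` takes any IO-like object, the same method should also work on `$stdin`, so `curl ... | ruby script.rb` and the open-uri route can share the parsing code.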