On 9/19/07, Chuck Dawit <chuckdawit / gmail.com> wrote:
>
> I submitted a post a few days ago about scraping the web for Cisco
> products. I didn't receive that much input so I thought I would ask
> again. Here are the requirements. I have a list of 2000 URLs that all
> have Cisco in their domain names
> (e.g. http://www.soldbycisco.net
> http://www.ciscoindia.net
> http://www.ciscobootcamp.net
> http://www.cisco-guy.net)
>
> I want to scrape them and determine which websites are selling new
> Cisco products. I'm actually looking for around 20 or so products
> (e.g. WIC-1T, NM-4E, WS-G2950-24). One idea I was given was to split
> the pages into those with forms and those without. Those without forms
> probably won't have anything for sale, so I can eliminate them. But I
> really don't know how to proceed after that. Does anyone have a
> different or better approach? Any help would be appreciated.
> --
> Posted via http://www.ruby-forum.com/.
>
>
Not to make your problem worse, but you will also need to differentiate
between new and used equipment.
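
As a rough starting point, here is a minimal sketch of how a single page
could be classified once its HTML has been fetched. It only illustrates
the idea and is not a tested scraper: the keyword lists and the
classify_page helper are assumptions made up for this example, the tag
stripping is deliberately crude, and only the three product codes from
the original post are included.

```ruby
# Product codes taken from the examples in the original post.
PRODUCTS = %w[WIC-1T NM-4E WS-G2950-24].freeze

# Hypothetical keyword lists (assumptions); tune them against real pages.
USED_WORDS = /\b(used|refurbished|refurb|pre-owned|second-hand)\b/i
NEW_WORDS  = /\b(new|factory sealed|unopened)\b/i

# Classify one page given its raw HTML as a String.
# Returns a Hash with the product codes found, whether the page
# contains a <form> tag, and a guessed condition.
def classify_page(html)
  text = html.gsub(/<[^>]+>/, " ") # crude tag stripping, not a real parser

  found = PRODUCTS.select do |code|
    text =~ /\b#{Regexp.escape(code)}\b/i
  end

  has_form = !!(html =~ /<form\b/i)

  # "Used" wins when both kinds of keywords appear, so that mixed pages
  # get flagged for a manual look rather than counted as new.
  condition =
    if found.empty?
      :none
    elsif text =~ USED_WORDS
      :used
    elsif text =~ NEW_WORDS
      :new
    else
      :unknown
    end

  { products: found, has_form: has_form, condition: condition }
end
```

Pages that come back with an empty :products list can be dropped right
away, and the :unknown and :used ones are the ones that would probably
need a closer look by hand. For real pages an HTML parser would be more
robust than the regex-based tag stripping used here.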

-- 
"Hey brother Christian with your high and mighty errand, Your actions speak
so loud, I can't hear a word you're saying."

-Greg Graffin (Bad Religion)
