Michael Neumann wrote:
> James Britt wrote:
>> ..
>> Anyway, I thought this was a neat enough demo of how easy it is to use 
>> Mechanize that I should share it. How the actual search page ends up 
>> is another matter. Time to go find my Google API key perhaps.
> 
> 
> Would you like to share the code with us? Should I include it as an 
> example in WWW::Mechanize?

Sure.  The live version uses the first pass at the Mechanize hack; the 
runs-at-home version uses the more flexible version I wrote while 
replying to your earlier post.  ("Ruby: Ain't it cool?")

But that code is different from your suggestion (and, I gather, 
implementation) on how else to do this (though in practice it is quite 
similar).

So, yes, if Mechanize adopts a way to pass in a 'watch_for set', and 
then makes the collected nodes available via 'watches', then the Google 
scrape code might make a good example, even if it never goes 'live' on 
ruby-doc.org.

I'd just need to clean it up to use the most current API.

Note that root.find_all_recursive {|n| n.name == 'p'} would work as 
well as what I do now; my Para class does nothing more than call 
node.to_s.  The advantage of having parse_html collect nodes during the 
HTML stream parse, though, is that it is faster than re-iterating over 
the node tree every time you want a set of nodes.
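
Here is a rough sketch of the difference.  It uses a stand-in Node class 
and a made-up WatchingBuilder, not the real Mechanize or parser classes, 
so treat every name in it (apart from find_all_recursive and 'watches') 
as an assumption for illustration only:

```ruby
# Stand-in for a parsed HTML node; NOT the real Mechanize node class.
class Node
  attr_reader :name, :children

  def initialize(name, children = [])
    @name = name
    @children = children
  end

  # Walks the whole subtree every time it is called.
  def find_all_recursive(&block)
    found = []
    found << self if block.call(self)
    children.each { |c| found.concat(c.find_all_recursive(&block)) }
    found
  end
end

# Hypothetical builder that files away watched nodes as they are created,
# so no second walk over the tree is needed afterwards.
class WatchingBuilder
  attr_reader :watches

  def initialize(*names)
    @watch_for = names
    @watches = Hash.new { |h, k| h[k] = [] }
  end

  def node(name, children = [])
    n = Node.new(name, children)
    @watches[name] << n if @watch_for.include?(name)
    n
  end
end

builder = WatchingBuilder.new('p')
root = builder.node('html', [
  builder.node('body', [
    builder.node('p'),
    builder.node('div', [builder.node('p')])
  ])
])

walked    = root.find_all_recursive { |n| n.name == 'p' }  # full tree walk
collected = builder.watches['p']                           # already gathered
puts walked.size
puts collected.size
```

Both approaches end up with the same two nodes; the second one just 
avoids walking the tree again for each query.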

My Google search code, then, is a somewhat gratuitous use of 
agent.watch_for_set (it is a good example of "Gee, I wonder if ..."), 
though I could perhaps add something that gives a more practical example 
of collecting nodes as custom classes.

Maybe create a version of Para that exposes the element CSS class and id 
as properties.  Then replace the element CSS class value with one of my 
own to better control the resulting page style.   Or something.
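
Something along these lines, perhaps.  This is only a sketch: Element is 
a stand-in for whatever node object the parser hands over, and the 
css_class/id accessors and the 'search-result' class name are made up for 
the example, not part of any existing API:

```ruby
# Stand-in for a parsed element; the real node class will differ.
Element = Struct.new(:name, :attributes, :text) do
  def to_s
    attrs = attributes.map { |k, v| %( #{k}="#{v}") }.join
    "<#{name}#{attrs}>#{text}</#{name}>"
  end
end

# Hypothetical Para wrapper exposing the element's CSS class and id.
class Para
  def initialize(node)
    @node = node
  end

  def css_class
    @node.attributes['class']
  end

  def css_class=(value)
    @node.attributes['class'] = value
  end

  def id
    @node.attributes['id']
  end

  # Same as before: just delegate to the wrapped node.
  def to_s
    @node.to_s
  end
end

para = Para.new(Element.new('p', { 'class' => 'g', 'id' => 'r1' }, 'Ruby docs'))
para.css_class = 'search-result'   # swap in my own class name
puts para.to_s
```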


Thanks,

James