I got one of my servers updated and I'm now running Nokogiri without
errors which is great news.
Here is my new code:
-------------------
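require 'net/http'
require 'nokogiri'

url = URI.parse("http://www.apartment-directory.info")
# Fetch the Connecticut listing page
res = Net::HTTP.start(url.host, url.port) { |http|
  http.get('/connecticut/0')
}
page = Nokogiri::HTML(res.body)
# Print the text of every link that sits inside a table cell
page.xpath("//tr//td/a").each do |node|
  puts node.text
end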
-----------------
This returns some of the data that I need but not all of it.
I do not understand this line:
page.xpath("//tr/td")
I know it is supposed to be the path to the data I need, but I'm not sure
how to get to all of the data from the URL. Some of the data seems to sit
between tags that I can't figure out.
This is one record from the webpage in HTML:
-----
<tr bgcolor=white><td valign=top><a
href="/map/22-glenbrook-road-condo-associate/stamford-connecticut-06902-(203)327-4028/14741"
title="Condominium Office Rental and Leasing, Condominiums and
Townhouses, Condominium and Townhouse Rental and Leasing ">22 Glenbrook
Road Condo Associates</a></td><td valign=top>
<a
href="/map/22-glenbrook-road-condo-associate/stamford-connecticut-06902-(203)327-4028/14741"
class=map>Map It!</a> </td><td valign=top>22 Glenbrook
Road</td>
<td valign=top>Stamford, CT 06902</td>
<td valign=top nowrap>(203) 327-4028</td></tr>
-----
I need to be able to pull the following information out of one record:
22 Glenbrook Road Condo Associates,22 Glenbrook
Road,Stamford,CT,06902,(203) 327-4028
I thought that if I configured Nokogiri with:
page.xpath("//tr/td")
..that it would get me inside these table brackets, but it's not working.
Can you possibly point out where I'm going wrong?
thanks for the help,
atomic
--
Posted via http://www.ruby-forum.com/.