On Wed, Oct 27, 2010 at 2:57 AM, Corey Watts <cwatts / westmont.edu> wrote:
> I still haven't figured this out.  Perhaps I should phrase the question
> a different way...
>
> What is the preferred method of extracting the href attribute from a
> link?  I've tried doing it using .search() and searching for the xml
> @href attribute.  For some reason that's not working for me.
>
> Is there a different way of extracting this attribute, without using
> .search and an xml path?  I'm sure mechanize has some other method
> too...

With this and a local version of the page I was able to get the info you want:

#!/bin/env ruby19

require 'nokogiri'

raw = File.read("restaurants.html", mode: "r:UTF-8")
puts raw.encoding
# raw.force_encoding 'UTF-8'
doc = Nokogiri.parse raw

doc.xpath('//div[@class="listing_content"]').each do |listing|
  puts '----------------------------------------'
  # p listing.to_s[0...10]+"..."
  puts listing
  puts '----------------------------------------'
  # p listing.xpath('.//a//text()').map(&:to_s)
  listing.xpath('.//a[@href and contains(text(),"Website")]/@href').each do |a|
    p a.value
  end
  puts
end

Cheers

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/