Hello,

> What is the encoding of your input HTML file?

When I open one of the files in IRB and check external_encoding.name,
it returns "UTF-8".
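
One caveat: as far as I understand it, external_encoding reports the
encoding Ruby assumes for the file (normally Encoding.default_external),
not something detected from the contents. A rough sketch of how the
bytes themselves could be checked follows. The helper names valid_utf8?
and utf8_file? are made up for illustration.

```ruby
# Hypothetical helper: true if the raw bytes form valid UTF-8.
# dup keeps the caller's string untouched, since force_encoding mutates.
def valid_utf8?(bytes)
  bytes.dup.force_encoding(Encoding::UTF_8).valid_encoding?
end

# Hypothetical helper: check a saved page by path.
# File.binread returns the raw bytes without any transcoding.
def utf8_file?(path)
  valid_utf8?(File.binread(path))
end
```

If utf8_file? comes back false for a page, the "UTF-8" label from
external_encoding was only Ruby's assumption about that file.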

This is from a group of pages I scraped with Hpricot (before switching
to Nokogiri) and saved locally.

The site itself comes from a Microsoft environment, and there seems to
be a lot of weirdness in the files. I'll need to anticipate and
accommodate that in my code.
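
A minimal sketch of one way to accommodate it, assuming the odd bytes
are Windows-1252 (a guess based on the Microsoft origin, not something
I have confirmed). The helper name to_utf8 and its fallback parameter
are hypothetical.

```ruby
# Hypothetical helper: return the page text as UTF-8.
# Assumption: bytes that are not valid UTF-8 are Windows-1252.
def to_utf8(bytes, fallback: Encoding::Windows_1252)
  text = bytes.dup.force_encoding(Encoding::UTF_8)
  return text if text.valid_encoding?

  # Reinterpret the raw bytes in the fallback encoding and transcode,
  # replacing anything that has no mapping instead of raising.
  bytes.dup.force_encoding(fallback)
       .encode(Encoding::UTF_8, invalid: :replace, undef: :replace)
end
```

The cleaned string could then be passed to Nokogiri::HTML with 'UTF-8'
as the explicit encoding argument, so the parser does not have to guess.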

I wonder if I might have better luck building the scraping portion of
my app in a different language (though I'd rather stick with Ruby).