I have an HTML file which is encoded in UTF-8. The file contains the
following text:

It&#39;s a wonderful life

Now, character code 39 is the apostrophe in UTF-8. So suppose I got
the 39 out of the text using:

s="It&#39;s a wonderful life"

s.gsub(/&#(\d+);/, '\1')

The output is

It39s a wonderful life

So firstly, I am having trouble turning it into

It\39s a wonderful life

Secondly, I manually put this in test_utf8.rb:

puts "It\39s a wonderful life"

and ran it

ruby test_utf8.rb > utf8.txt

but when I open the file in OpenOffice with the encoding set to UTF-8,
the output is

It#9s a wonderful life
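
A likely explanation for the stray character, as a small sketch rather than
a confirmed diagnosis: in a double-quoted Ruby string, a backslash followed by
digits is read as an octal escape, so "\39" becomes the byte 0x03 followed by
a literal "9". The variable name below is only for illustration.

```ruby
# "\3" is parsed as an octal escape (byte 0x03); "9" is not an octal digit,
# so it is left as an ordinary character after the control byte.
escaped = "It\39s a wonderful life"
p escaped.bytes.first(4)   # => [73, 116, 3, 57]

# Decimal 39 is octal 47 and hex 27, so either escape yields an apostrophe.
p "It\47s" == "It's"       # => true
p "It\x27s" == "It's"      # => true
```

So the text inside the file is probably not an apostrophe at all but a
control character, which OpenOffice then displays in its own way.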

So how do I correctly parse HTML character references, convert them to
UTF-8 encoded characters, and then save the file?
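
One possible approach, sketched under some assumptions: only decimal
references of the form &#NNN; are handled (not hex forms like &#x27; or named
entities), and the helper names decode_numeric_refs and save_utf8 are made up
for this example. The block form of gsub turns each captured number into the
character for that code point with Array#pack("U").

```ruby
# Hypothetical helper: replace each decimal reference such as &#39; with
# the UTF-8 character for that code point.
def decode_numeric_refs(text)
  text.gsub(/&#(\d+);/) { [$1.to_i].pack("U") }
end

# Hypothetical helper: write the text to a file, explicitly as UTF-8.
def save_utf8(path, text)
  File.open(path, "w:UTF-8") { |f| f.puts text }
end

s = "It&#39;s a wonderful life"
decoded = decode_numeric_refs(s)
puts decoded   # prints: It's a wonderful life
```

Calling save_utf8("utf8.txt", decoded) would then write the decoded line to
utf8.txt. For real HTML it may be safer to rely on an existing entity decoder
than on a hand-written regex.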

Thanks.
-- 
Posted via http://www.ruby-forum.com/.