Hello

I have a lot of XML and Java files which have German umlauts and other
non-ASCII characters in them.

I want to read the files and convert them to UTF-8 using a Ruby script.

I convert the strings with this code:

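def to_utf8(str)
  # unpack('U*') decodes str as UTF-8 into an array of code points;
  # it raises ArgumentError if the bytes are not valid UTF-8.
  str.unpack('U*').map do |c|
    if c < 0x80
      c.chr                 # plain ASCII is passed through unchanged
    else
      '( u%04X )' % c       # everything else is written as a hex escape
    end
  end.join
end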

(taken from "The Ruby Way" by Hal Fulton).

Sometimes it works, but sometimes I get this error:
"malformed UTF-8 character"

I thought this might happen because the file is encoded in ISO-8859-1
(it was written with Eclipse set to ISO-8859-1 for text encoding).
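
A small check that seems to support this guess. It is only a sketch and
assumes Ruby 1.9 or later, where String#force_encoding and String#encode
exist; the variable names are just for illustration:

```ruby
# "Bär" as raw ISO-8859-1 bytes: 0xE4 is the single byte for 'ä' there.
latin1_bytes = [0x42, 0xE4, 0x72].pack('C*')

# Read as UTF-8, the 0xE4 byte starts a multi-byte sequence that never
# completes, so the string is not valid UTF-8.
puts latin1_bytes.dup.force_encoding('UTF-8').valid_encoding?   # false

# Declaring the real source encoding first and then transcoding gives
# valid UTF-8 that unpack('U*') can decode.
converted = latin1_bytes.dup.force_encoding('ISO-8859-1').encode('UTF-8')
puts converted.valid_encoding?   # true
p converted.unpack('U*')         # [66, 228, 114]
```

Files that contain only ASCII are identical in both encodings, which
would explain why the conversion only fails sometimes.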

How can I read a file with Ruby and specify that it is read with
ISO-8859-1 encoding (similar to Java's BufferedReader, where I can
specify the encoding)?
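
A sketch of what this might look like, assuming Ruby 1.9 or later, where
the mode string passed to File.open can name an external and an internal
encoding. read_latin1_as_utf8 is a hypothetical helper name of my own,
and the temporary file is only there to make the example self-contained:

```ruby
require 'tempfile'

# Hypothetical helper: treats the file's bytes as ISO-8859-1 (external
# encoding) and returns the content transcoded to UTF-8 (internal encoding).
def read_latin1_as_utf8(path)
  File.open(path, 'r:ISO-8859-1:UTF-8') { |f| f.read }
end

# Self-check with a temporary file holding "Bär" as ISO-8859-1 bytes.
Tempfile.create('latin1') do |tmp|
  tmp.binmode
  tmp.write([0x42, 0xE4, 0x72].pack('C*'))
  tmp.flush

  text = read_latin1_as_utf8(tmp.path)
  puts text.encoding          # UTF-8
  puts text.valid_encoding?   # true
end
```

On Ruby 1.8 this mode string is not available, so a separate conversion
step would be needed there.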

Any help is welcome. Best wishes,

Claus

-- 
Posted via http://www.ruby-forum.com/.