You're doing two different things.

> $ cat test.rb
> a = "Der gro\xdfe BilderSauger"

That's a double-quoted string, so Ruby interprets the escape sequences 
in its contents. A common example is \n meaning "newline"; in the same 
way, \xNN means the single byte with hex code NN. So when you call 
each_byte, that's what you get: one byte, 0xdf (decimal 223).

Change the double quotes to single quotes and Ruby leaves the escape 
alone, so you'll actually get the four separate characters.
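
To see the difference directly, here is a small sketch comparing the bytes 
of the two kinds of literal (the variable names are made up for the example):

```ruby
# Double quotes: Ruby translates \xdf into the single byte 0xDF (223).
double = "gro\xdfe"
# Single quotes: the backslash, 'x', 'd' and 'f' stay as four characters.
single = 'gro\xdfe'

double_bytes = double.bytes.to_a
single_bytes = single.bytes.to_a

puts double_bytes.inspect
puts single_bytes.inspect
```

Where the first array has a single 223, the second has 92, 120, 100, 102, 
which is the same sequence you saw when reading from the file.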

> But when read from a file:
...
>   l.each_byte {|b| puts b}
...
> 92    <- Here
> 120  <- we
> 100  <- are
> 102  <- as 4 ASCII chars '\xdf'

That proves that the file actually contains the four characters
'\', 'x', 'd', 'f'. If you want further proof, try

    hexdump -C test.in

to take Ruby out of the loop completely.

So there's neither UTF-8 nor ISO-8859-1 in that file, just plain ASCII 
characters.

If you want to turn each literal \xNN sequence into the byte it stands 
for, you have to process the text yourself. For example:

  l.gsub!(/\\x([0-9a-f]{2})/i) { $1.hex.chr }

  # or in Ruby 1.9, if you want to tag the encoding of the replacement:

  l.gsub!(/\\x([0-9a-f]{2})/i) { $1.hex.chr("ISO-8859-1") }
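
If what you ultimately want is a UTF-8 string, one possible approach is 
sketched below. It assumes the escaped bytes are meant as ISO-8859-1, which 
is a guess based on \xdf being the sharp s in that encoding. The helper name 
unescape_hex is made up for this example, and it works on a binary copy of 
the line so the replacement bytes never clash with the line's own encoding.

```ruby
# Hypothetical helper: replaces each literal \xNN with the byte it names.
def unescape_hex(line)
  # Work on a binary copy so arbitrary bytes can be inserted safely.
  binary = line.dup.force_encoding(Encoding::BINARY)
  binary.gsub(/\\x([0-9a-f]{2})/i) { $1.hex.chr }
end

# Single quotes, so this holds the four characters \ x d f, as in the file.
raw = unescape_hex('Der gro\xdfe BilderSauger')

# Assumption: the bytes are ISO-8859-1. Tag them as such, then transcode.
utf8 = raw.dup.force_encoding("ISO-8859-1").encode("UTF-8")

puts raw.bytes.to_a.inspect
puts utf8
```

Here raw holds 22 bytes with 223 in place of the four escape characters, 
and utf8 is the same text transcoded, with the sharp s stored as two bytes.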
