Phillip Gawlowski wrote in post #1021166:
> On Sat, Sep 10, 2011 at 6:52 PM, Brian Candler <b.candler / pobox.com>
> wrote:
>> irb(main):005:0> f2.write("groß")
>> => 5
>>
>> Normally, transcoding a UTF-8 string (which contains non-ASCII
>> characters) to ASCII-8BIT would raise an exception:
>
> You obviously aren't aware what Extended ASCII (aka ASCII 8bit) is.

I'm not? Then please explain it to me. What I know so far about ruby 1.9 encoding I have documented at https://github.com/candlerb/string19/blob/master/string19.rb

There are some 200+ behaviours there, reverse-engineered with tests.

However, I'm quite happy to have gaps in my knowledge filled out.

> Take a look at ISO 8859-1, and check what 11011111 encodes.

What has ISO-8859-1 got to do with this?

*If* the file had an external encoding of ISO-8859-1, then the character "ß" (two bytes in UTF-8) would have been translated to the single byte 0xDF as it was written out.
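For comparison, here is a minimal sketch of that ISO-8859-1 case. The file name and the temporary directory are only for illustration, and "\u00DF" is used so that the literal is UTF-8 whatever the source encoding happens to be:

```ruby
require "tmpdir"

s = "gro\u00DF"          # "groß" as a UTF-8 string
latin1_bytes = nil

Dir.mktmpdir do |dir|
  path = File.join(dir, "latin1.txt")   # hypothetical file name
  # External encoding ISO-8859-1: the string is transcoded as it is written
  File.open(path, "w:ISO-8859-1") { |f| f.write(s) }
  # Read the file back as raw bytes to see what actually landed on disk
  latin1_bytes = File.binread(path).bytes
end

p s.bytes        # => [103, 114, 111, 195, 159]  (ß is 0xC3 0x9F in UTF-8)
p latin1_bytes   # => [103, 114, 111, 223]       (ß is the single byte 0xDF)
```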

But in this example, the external encoding is ASCII-8BIT, which is ruby's encoding for "ASCII characters in the low 128 values, and unknown binary in the high 128 values".

Ruby does not let you transcode UTF-8 to ASCII-8BIT, or back again, if there are any high-value characters in it. You can confirm this easily:

>> s1 = "groß"
=> "groß"
>> s1.encode("ASCII-8BIT")
Encoding::UndefinedConversionError: U+00DF from UTF-8 to ASCII-8BIT
  from (irb):2:in `encode'
  from (irb):2
  from /usr/local/bin/irb192:12:in `<main>'
>> s2 = "gro\xDF".force_encoding("ASCII-8BIT")
=> "gro\xDF"
>> s2.encode("UTF-8")
Encoding::UndefinedConversionError: "\xDF" from ASCII-8BIT to UTF-8
  from (irb):4:in `encode'
  from (irb):4
  from /usr/local/bin/irb192:12:in `<main>'
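
The same rule can be checked from a script rather than irb. This is only a sketch: the variable names are mine, and it adds the pure-ASCII case, which does transcode in both directions:

```ruby
ascii = "gro"            # only low (7-bit) characters
utf8  = "gro\u00DF"      # "groß", contains U+00DF

# ASCII-only data converts cleanly in both directions
as_binary = ascii.encode("ASCII-8BIT")
round_trip = as_binary.encode("UTF-8")

# A high-value character makes String#encode raise
utf8_to_binary_failed =
  begin
    utf8.encode("ASCII-8BIT")
    false
  rescue Encoding::UndefinedConversionError
    true
  end

# A high byte tagged as ASCII-8BIT cannot be transcoded to UTF-8 either
binary_to_utf8_failed =
  begin
    "gro\xDF".force_encoding("ASCII-8BIT").encode("UTF-8")
    false
  rescue Encoding::UndefinedConversionError
    true
  end

p as_binary.encoding == Encoding::ASCII_8BIT
p round_trip.encoding == Encoding::UTF_8
p utf8_to_binary_failed
p binary_to_utf8_failed
```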

Now, if you open a file for writing and set an external encoding (the default is nil), it means "transcode to this encoding". But for some reason, setting the external encoding to ASCII-8BIT bypasses this rule: the string is written out byte for byte, with no transcoding and no exception.
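
A small sketch of that bypass, assuming a throwaway file in a temporary directory (the file name is made up). It reflects the behaviour described in this thread, so it is worth re-running on whichever ruby version you care about:

```ruby
require "tmpdir"

s = "gro\u00DF"          # "groß" as a UTF-8 string
raw_bytes = nil

Dir.mktmpdir do |dir|
  path = File.join(dir, "binary.txt")   # hypothetical file name
  # External encoding ASCII-8BIT: no exception, even though
  # s.encode("ASCII-8BIT") would raise UndefinedConversionError
  File.open(path, "w:ASCII-8BIT") { |f| f.write(s) }
  raw_bytes = File.binread(path).bytes
end

# The UTF-8 bytes went to disk untouched
p raw_bytes == s.bytes
```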