On Sun, Sep 11, 2011 at 5:15 AM, Brian Candler <b.candler / pobox.com> wrote:
> Phillip Gawlowski wrote in post #1021166:
>> On Sat, Sep 10, 2011 at 6:52 PM, Brian Candler <b.candler / pobox.com>
>> wrote:
>>> irb(main):005:0> f2.write("gro")
>>> => 5
>>>
>>> Normally, transcoding a UTF-8 string (which contains non-ASCII
>>> characters) to ASCII-8BIT would raise an exception:
>>
>> You obviously aren't aware what Extended ASCII (aka ASCII 8bit) is.
>
> I'm not? Then please explain it to me. What I know so far about ruby 1.9
> encoding I have documented at
> https://github.com/candlerb/string19/blob/master/string19.rb

irb(main):001:0> u=Encoding::BINARY
=> #<Encoding:ASCII-8BIT>
irb(main):002:0> u.names
=> ["ASCII-8BIT", "BINARY"]

In other words: binary and ASCII-8BIT are the same.

I find the behavior totally consistent: when a String with *any*
encoding is written to a BINARY stream, its bytes are simply dumped as
is and _no conversion_ is done.  So there cannot be an encoding error.

irb(main):003:0> s="groß"
=> "groß"
irb(main):004:0> s.size
=> 4
irb(main):005:0> s.bytesize
=> 5
irb(main):006:0> s.encoding
=> #<Encoding:UTF-8>
irb(main):007:0> File.open("x","wb"){|io| p io.external_encoding,
io.internal_encoding ; io.write(s)}
#<Encoding:ASCII-8BIT>
nil
=> 5
irb(main):008:0> File.stat("x").size
=> 5

Now I can read the file with encoding UTF-8 properly:

irb(main):009:0> t = File.open("x","r:UTF-8") {|io| io.read}
=> "groß"
irb(main):010:0> t.size
=> 4
irb(main):011:0> t.bytesize
=> 5
irb(main):012:0> t.encoding
=> #<Encoding:UTF-8>
irb(main):013:0> s == t
=> true
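
For comparison, here is a small sketch of what happens when the same
bytes are read back in binary mode instead. It is only an illustration:
the temporary directory and the variable names (raw, relabeled) are my
own, and the string is written with a \u escape so the script does not
depend on the source file's encoding.

```ruby
require "tmpdir"

s = "gro\u00DF"  # the UTF-8 string from the session above (4 chars, 5 bytes)

raw = Dir.mktmpdir do |dir|
  path = File.join(dir, "x")
  File.open(path, "wb") { |io| io.write(s) }  # bytes dumped as is
  File.binread(path)                          # read back without decoding
end

raw_encoding = raw.encoding        # the BINARY / ASCII-8BIT encoding
raw_size     = raw.size            # counts bytes: BINARY has no multibyte chars
same_bytes   = raw.bytes == s.bytes

# Relabel the bytes as UTF-8; force_encoding changes the tag, not the bytes.
relabeled = raw.dup.force_encoding(Encoding::UTF_8)

p raw_encoding, raw_size, same_bytes, relabeled == s
```

The bytes on disk are identical either way; only the label attached to
the String after reading differs.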

> Now, if you open a file for write and set an external encoding (the
> default is nil) it means "transcode to this encoding". But for some
> reason, setting external encoding to ASCII-8BIT bypasses this rule.

See above.  The name ASCII-8BIT is probably not the best choice for
this, but if you think of it as BINARY then everything fits nicely
together.
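
To make the contrast with "real" external encodings concrete, here is a
rough sketch that writes the same string three ways. ISO-8859-1 and
US-ASCII are just encodings I picked for illustration, and the names
sizes and ascii_error are made up for this example.

```ruby
require "tmpdir"

s = "gro\u00DF"  # UTF-8, 4 characters, 5 bytes

sizes = {}
ascii_error = nil

Dir.mktmpdir do |dir|
  path = File.join(dir, "x")

  # BINARY (ASCII-8BIT): no conversion, the 5 UTF-8 bytes are dumped as is.
  File.open(path, "wb") { |io| io.write(s) }
  sizes[:binary] = File.size(path)

  # ISO-8859-1 as external encoding: the string is transcoded on write,
  # so each character becomes a single byte.
  File.open(path, "w:ISO-8859-1") { |io| io.write(s) }
  sizes[:latin1] = File.size(path)

  # US-ASCII cannot represent U+00DF, so the write raises.
  begin
    File.open(path, "w:US-ASCII") { |io| io.write(s) }
  rescue Encoding::UndefinedConversionError => e
    ascii_error = e.class
  end
end

p sizes, ascii_error
```

Only the encodings that actually describe characters trigger a
conversion (and can therefore fail); BINARY just means "bytes".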

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/