Robert K. wrote in post #988429:
> On 20.03.2011 14:19, Brian Candler wrote:
>>> I'd say it means that the default encoding is used.
>>
>> No, it doesn't.
>
> So, which encoding is used then?

None.

> An encoding *has* to be used because
> you cannot write to a file without a particular encoding.

Untrue. In Unix, read() and write() just work on sequences of bytes, and have no concept of encoding.

Perhaps you are thinking of a language like Python 3, where there is a distinction between "characters" and "bytes representing those characters" (maybe Java has that distinction too; I don't know enough about Java to say).

In ruby 1.9, every String is a bunch of bytes plus an encoding tag. When you write it out to a file and the external encoding is nil, just the bytes are written and the encoding tag is ignored.
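For example (a quick sketch; the file name "y" is arbitrary), the encoding tag makes no difference to what ends up on disk when the external encoding is nil:

    s = "caf\xC3\xA9"                 # the UTF-8 bytes for "café"
    s.force_encoding("ISO-8859-1")    # retag only; the bytes are untouched
    File.open("y", "w") { |io| p io.external_encoding; io.write(s) }  # prints nil
    File.binread("y").unpack("C*") == s.unpack("C*")                  # => true, same bytes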

> I could see in the console that the file was read properly.

What you see in the console in irb does not necessarily mean much in ruby 1.9, because STDOUT.external_encoding is nil by default too.
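For instance (again just a sketch), even a string that is not valid in its tagged encoding prints without complaint, because the bytes are passed straight through:

    p STDOUT.external_encoding                # => nil (by default)
    bad = "\xFF\xFE".force_encoding("UTF-8")  # not a valid UTF-8 sequence
    bad.valid_encoding?                       # => false
    puts bad                                  # no error: the raw bytes go to the terminal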

> irb(main):001:0> File.open("x","w"){|io| p io.external_encoding; io.puts "a"}
> nil
> => nil
> irb(main):002:0> s = File.open("x","r:UTF-8"){|io| p io.external_encoding; io.read}
> #<Encoding:UTF-8>
> => "a\n"
> irb(main):003:0> s.valid_encoding?
> => true

Now, that's more complex, and it *does* show that the data is valid UTF-8. (I wasn't arguing that it wasn't; I was arguing that your logic was flawed: even if the data were not valid UTF-8, your program would have run without raising an error, so the fact that it runs without error is insufficient to show that the data is valid UTF-8.)
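To illustrate that point (a sketch; the file name "bad" is arbitrary): reading with "r:UTF-8" only tags the data, it does not validate it, so a bogus byte sequence is read without raising:

    File.open("bad", "w") { |io| io.write("\xFF\xFE") }  # deliberately invalid UTF-8
    s = File.open("bad", "r:UTF-8") { |io| io.read }     # no error raised here
    s.encoding                                           # => #<Encoding:UTF-8>
    s.valid_encoding?                                    # => false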

[In Java]
> Now suddenly String.length() no longer returns the length in real
> characters (code points) but rather the length in chars.  I figure,
> Ruby's solution might not be so bad after all.

Of course, even in Unicode, the number of code points is not necessarily the same as the number of glyphs or "printable characters".
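A quick example: "é" written as a base letter plus a combining accent displays as one glyph but is two code points.

    s = "e\u0301"   # U+0065 "e" followed by U+0301 COMBINING ACUTE ACCENT
    puts s          # displays as a single glyph: é
    s.length        # => 2 code points
    s.bytesize      # => 3 bytes in UTF-8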