On Thursday 16 April 2009 15:23:05 James Gray wrote:
> On Apr 16, 2009, at 3:15 PM, Brian Candler wrote:
> > James Gray wrote:
> >> But you bundled those files.  You know the encoding much better than
> >> Ruby.  Is it really too much to ask for?
> >>
> >>   html = File.read("my_template.html", external_encoding: "UTF-8")
> >
> > Sure, *if you remember* everywhere this is needed. If you don't, then
> > your program will work fine, and pass all your tests, until you run it
> > somewhere else and it dies.
>
> Well, I definitely don't think this is the first case of that in Ruby
> (or most other languages for that matter).  Heck, fork() isn't cross-
> platform and I love fork().

Indeed, and there are win32-specific things on Windows. Even something as
simple as pathnames isn't universal unless you always use File.join -- or
better yet, Pathname. How often do you do that, instead of just:

open 'foo/bar.txt'

This is a weak example, now that Windows supports / as well as \ as a 
directory delimiter, but I think I've made my point. Even Java programs have 
platform-specific quirks, and this one is quite avoidable.
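
For comparison, a minimal sketch of the two portable spellings mentioned
above. The helper names (joined_path, pathname_path) are made up for
illustration; only File.join and Pathname come from the standard library.

```ruby
require 'pathname'

# Hypothetical helper: build a path from components with File.join
# instead of hard-coding a separator in a string literal.
def joined_path(*parts)
  File.join(*parts)
end

# Hypothetical helper: the same idea with Pathname, which returns a
# path object rather than a plain String.
def pathname_path(first, *rest)
  rest.inject(Pathname.new(first)) { |path, part| path + part }
end

puts joined_path('foo', 'bar.txt')
puts pathname_path('foo', 'bar.txt')
```

File.join inserts File::SEPARATOR between the components, so the
separator never appears in your own code.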

Given that most other software on a given system (including Perl) will obey a 
default encoding, unless it has a specific reason to believe otherwise (like a 
byte-order mark), I think it's reasonable for Ruby to do the same. Your 
suggestion to default to UTF-8 really only makes sense on English systems
(where encoding is likely to be set to that anyway) -- and even that doesn't 
save you from having to specify binary for binary files.
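
To make the three cases concrete, here is a rough sketch, assuming
Ruby 1.9 or later. The helper names (read_template, read_bytes) are
hypothetical, and the temp file just stands in for a bundled template.

```ruby
require 'tempfile'

# Hypothetical helper: a file we bundled ourselves, so we state its
# encoding explicitly instead of trusting the locale.
def read_template(path)
  File.read(path, external_encoding: 'UTF-8')
end

# Hypothetical helper: binary data has to be requested as binary
# whatever the default encoding is.
def read_bytes(path)
  File.binread(path)
end

Tempfile.create('template') do |f|
  f.binmode
  f.write("caf\xC3\xA9".b)   # "café" as raw UTF-8 bytes
  f.flush

  # What a plain File.read would tag strings with; depends on the locale.
  puts Encoding.default_external

  puts read_template(f.path).encoding   # explicit, same on every system
  puts read_bytes(f.path).encoding      # binary (ASCII-8BIT)
end
```

Only the first line of output varies from system to system, which is
the whole argument in miniature.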

For that matter, if you've written all your tests, and they pass on one 
system, and fail on another, your tests are working as designed -- in this 
case, exposing a platform-specific bug, either in your program or the 
interpreter.