On Wed, May 16, 2012 at 3:02 PM, Brian Candler <lists / ruby-forum.com> wrote:

> The issue is not so much that radiant is broken, but that ruby 1.9 is.
> It has both a broken philosophy (that strings of bytes should always be
> treated as characters in some dynamic encoding)

I think you are being unfair: 1.9 has to deal with history, and it was
actually 1.8 that was broken, because of its weak i18n support.  1.9
tries to evolve from that basis.  If I set encodings properly on
streams and on $stdin, $stdout and $stderr, things work just fine.
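A minimal sketch of what I mean (the file name is just an example):
declare the encoding on the streams so 1.9 knows what the bytes mean.

```ruby
# Declare the encoding of a standard stream explicitly.
$stdout.set_encoding("UTF-8")

# Declare the encoding when opening files, both for writing and reading.
File.open("names.txt", "w:UTF-8") { |f| f.puts "Müller" }
File.open("names.txt", "r:UTF-8") do |f|
  line = f.gets.chomp
  puts line.encoding  # => UTF-8
end
```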

Also, if 1.9 were really that broken we would be seeing many more
postings with encoding issues here.  But apparently most people get by
with 1.9 pretty well, which I would take as a data point indicating
that it cannot be as completely broken as you suggest.

You can even ignore the fact that internally String stores data as
bytes.  For many applications this is just an implementation detail of
String.  The encoding is typically not dynamic, because one normally
does not use #force_encoding, which simply overwrites the encoding
label unconditionally without touching the bytes.
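A small sketch of that difference: #encode transcodes the bytes to the
target encoding, while #force_encoding merely relabels them.

```ruby
# A Latin-1 string: the bytes for "Müller" with ü stored as 0xFC.
latin = "Müller".encode("ISO-8859-1")

utf       = latin.encode("UTF-8")                 # bytes converted
relabeled = latin.dup.force_encoding("UTF-8")     # bytes untouched, label changed

puts utf.valid_encoding?        # => true
puts relabeled.valid_encoding?  # => false  (0xFC is not valid UTF-8 here)
```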

> and a half-baked
> implementation (e.g. which can't convert accented characters from
> uppercase to lowercase).

Well, that's just a feature lacking completeness - you could also
call it a "bug".  But that is something different from a "broken
philosophy".

> There also remains a total lack of official
> specification or documentation. I speak as someone who has attempted to
> reverse-engineer it.

I get by pretty well with the current situation.  There's also what
James wrote at
http://blog.grayproductions.net/categories/character_encodings

> As for "there is a set of simple rules to make your work on 1.9.3
> painless", what this really is saying is "to make your program run
> reliably under ruby 1.9 you have to do magic incantations W, X, Y and
> Z", none of which was necessary in ruby 1.8. What's worse is that
> without these incantations, your app or library or test suite may run
> just fine on your machine, but crash on someone else's, as illustrated
> by the OP.

You make it sound like witch magic.  But it isn't.  My set of rules is
pretty small:

1. Take care of encodings when opening files (i.e. set the encoding).
2. Convert all Strings that need to be compared to the same encoding.
(For that, setting Encoding.default_internal is often sufficient.)
3. Test (but that's a general rule anyway).
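Rules 1 and 2 sketched together, assuming a hypothetical Latin-1 file
"legacy.txt": with Encoding.default_internal set, IO transcodes into
one common encoding on read, so comparisons just work.

```ruby
# Rule 2: pick one internal encoding; IO transcodes on read.
Encoding.default_internal = Encoding::UTF_8

# Create the sample file (transcoded to Latin-1 on write).
File.open("legacy.txt", "w:ISO-8859-1") { |f| f.write("Müller") }

# Rule 1: declare the file's encoding when opening it.
File.open("legacy.txt", "r:ISO-8859-1") do |f|
  name = f.read
  puts name.encoding  # => UTF-8
end
```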

The problems usually occur because file metadata does not contain
encoding information, so it must come from somewhere else - and that
process is not generally standardized.  We have encoding information
in HTTP and MIME, but it is lost once a file is stored somewhere,
unless one takes special measures to keep that information.  This is a
general issue, though, and cannot be attributed to 1.9.

> With ruby 1.9, writing correct programs and giving them sufficient test
> coverage is more work than it was before. Do you run your test suite
> multiple times with different settings of LC_ALL in the environment? If
> not, you should.

Well, obviously: there is a new feature (encoding) which affects *all*
IO.  1.8 versions didn't have it and would happily treat anything as a
proper string even when it wasn't properly encoded.  The fact that
they "just worked" does not mean that they necessarily worked
correctly.  1.9 is just as easy to use when you stay in the same
locale environment.  If something is to be released to the public,
then of course more testing should be done.
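This is easy to see for yourself: the default external encoding is
derived from the locale (LC_ALL etc.), which is exactly why the same
program can behave differently on another machine.  The actual values
printed here depend on your environment.

```ruby
# Both of these follow the process locale, e.g. UTF-8 under
# en_US.UTF-8 and US-ASCII under LC_ALL=C.
puts Encoding.default_external
puts Encoding.find("locale")
```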

Cheers

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/