Hi,

In message "Re: [ruby-core:18663] Re: Character encodings - a radical suggestion"
    on Wed, 17 Sep 2008 23:09:32 +0900, Matthias Wächter <matthias / waechter.wiz.at> writes:

|Is there a complete characterization of this whole problem? It seems
|to be the main reason for sticking to non-UTF-8 character sets in
|Ruby these days, and concluding from what I have read about it, a
|solution could be the addition of missing characters/codepoints to
|Unicode. Why does no-one consider going that way, but instead builds
|a complicated stack of functions for conversions on top level?

Just because it's impossible.  History sucks.  We have mixed up YEN
SIGN and REVERSE SOLIDUS for a long time.  They cannot be distinguished
without context information.  Technically 0x5c should mean REVERSE
SOLIDUS, but humans do not always read it that way.
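
A minimal sketch of the ambiguity, in Ruby.  The constant names are
only for illustration.  It shows that the byte is identical whichever
encoding label is attached to it, while Unicode gives the two
characters separate code points:

```ruby
# One byte, two encoding labels. Nothing in the byte itself says
# whether the writer meant a backslash or a yen sign.
RAW_BYTE = "\x5C".b
AS_ASCII = RAW_BYTE.dup.force_encoding("US-ASCII")
AS_SJIS  = RAW_BYTE.dup.force_encoding("Shift_JIS")

puts AS_ASCII.bytes == AS_SJIS.bytes  # => true
puts AS_SJIS.valid_encoding?          # => true

# In Unicode the two characters are distinct code points.
YEN       = "\u00A5"  # YEN SIGN
BACKSLASH = "\u005C"  # REVERSE SOLIDUS

puts YEN.ord == BACKSLASH.ord         # => false
p YEN.bytes                           # => [194, 165]
p BACKSLASH.bytes                     # => [92]
```

Which of the two a legacy 0x5c was meant to be is exactly the context
information that the data does not carry.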

Besides that, Unicode is not a panacea.  Some character sets
(e.g. GB18030 for Chinese characters) are even bigger than Unicode.
In fact, GB18030 is a superset of Unicode.
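
As a small illustration (not a proof of the superset claim), a sketch
that round-trips a UTF-8 string through GB18030 using Ruby's built-in
transcoder.  The method name gb18030_round_trip is hypothetical:

```ruby
# Convert UTF-8 text to GB18030 and back again.
def gb18030_round_trip(text)
  text.encode("GB18030").encode("UTF-8")
end

sample = "\u4E2D\u6587"  # two common Chinese characters
puts gb18030_round_trip(sample) == sample  # => true
p sample.encode("GB18030").bytes           # => [214, 208, 206, 196]
```

Two characters surviving the trip says nothing about the rest of the
repertoire; it only shows the conversion path exists.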

|To some extent, it looks like 'some' people like insisting on the
|status quo as it makes them feel special, swimming upstream the
|Unicode waterfall, retaining on regional locales instead of solving
|the issue. I do explicitly not refer to Ruby or the developers, they
|just accept these special needs more than other computer language
|designers with less sympathy for this anomaly.
|
|Nevertheless, a persisting fix is needed, and I think writing more
|and more crutches for encoding conversion goes the wrong way. This
|might still be needed for legacy file support, but day-to-day work
|should not have to deal with this issue so prominently.

You are free to feel so, but it's us who take up the burden.  However,
we are open to complaints about usability, e.g. saying no to
"r:UTF-16LE:UTF-8".
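
For reference, a sketch of what that mode string does: the first
encoding is the external one (how the file is stored), the second is
the internal one (what the program sees).  The helper names and the
temporary file are assumptions made only for this example:

```ruby
require "tempfile"

# Read a file stored as UTF-16LE and get its content as UTF-8.
def read_utf16le_as_utf8(path)
  File.open(path, "r:UTF-16LE:UTF-8") { |f| f.read }
end

# Write text as raw UTF-16LE bytes to a temp file, then read it back
# through the mode string above.
def utf16le_round_trip(text)
  Tempfile.create("utf16le-demo") do |tmp|
    tmp.binmode
    tmp.write(text.encode("UTF-16LE"))
    tmp.close
    read_utf16le_as_utf8(tmp.path)
  end
end

result = utf16le_round_trip("caf\u00E9")
puts result.encoding         # => UTF-8
puts result == "caf\u00E9"   # => true
```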

							matz.