On 17/09/2008, Yukihiro Matsumoto <matz / ruby-lang.org> wrote:
> Hi,
>
>  In message "Re: [ruby-core:18663] Re: Character encodings - a radical suggestion"
>
>     on Wed, 17 Sep 2008 23:09:32 +0900, Matthias Wächter <matthias / waechter.wiz.at> writes:
>
>  |Is there a complete characterization of this whole problem? It seems
>  |to be the main reason for sticking to non-UTF-8 character sets in
>  |Ruby these days, and concluding from what I have read about it, a
>  |solution could be the addition of missing characters/codepoints to
>  |Unicode. Why does no-one consider going that way, but instead builds
>  |a complicated stack of functions for conversions on top level?
>
>
> Just because it's impossible.  History sucks.  We have mixed up YEN
>  SIGN and REVERSE SOLIDUS for a long time.  They cannot be distinguished
>  without context information.  Technically 0x5c should mean REVERSE
>  SOLIDUS, but not always so for humans.
>
>  Besides that, Unicode is not a panacea.  Some character sets
>  (e.g. GB18030 for Chinese characters) are even bigger than Unicode.
>  In fact, GB18030 is a superset of Unicode.
>
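
To make the 0x5c point above concrete, here is a minimal Ruby sketch. It
is only an illustration, assuming a Ruby with String#force_encoding
(1.9 or later); the variable names are made up for the example. It shows
that the byte is identical under either label, so nothing in the data
says which glyph the writer meant, while Unicode keeps two separate
code points.

```ruby
# One byte, two labels: the byte value never changes, only the
# encoding tag attached to it.
raw      = [0x5C].pack("C")                      # a single byte, 0x5C
as_ascii = raw.dup.force_encoding("US-ASCII")
as_sjis  = raw.dup.force_encoding("Shift_JIS")

same_bytes = as_ascii.bytes == as_sjis.bytes     # both are [0x5C]
both_valid = as_ascii.valid_encoding? && as_sjis.valid_encoding?

# Unicode, by contrast, assigns two distinct code points.
reverse_solidus = "\u005C"                       # U+005C REVERSE SOLIDUS
yen_sign        = "\u00A5"                       # U+00A5 YEN SIGN

distinct       = reverse_solidus.ord != yen_sign.ord
yen_utf8_bytes = yen_sign.bytes                  # two bytes in UTF-8
```

Whether a given 0x5c was typed as a yen sign or as a backslash is
exactly the context information that the bytes do not carry.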

I wonder how people who suggest Unicode as the single internal
encoding would react if GB18030 was suggested instead ;-)

Thanks

Michal