On 17/09/2008, Yukihiro Matsumoto <matz / ruby-lang.org> wrote:
> Hi,
>
> In message "Re: [ruby-core:18663] Re: Character encodings - a radical suggestion"
>     on Wed, 17 Sep 2008 23:09:32 +0900, Matthias Wächter <matthias / waechter.wiz.at> writes:
>
> |Is there a complete characterization of this whole problem? It seems
> |to be the main reason for sticking to non-UTF-8 character sets in
> |Ruby these days, and concluding from what I have read about it, a
> |solution could be the addition of missing characters/codepoints to
> |Unicode. Why does no-one consider going that way, but instead builds
> |a complicated stack of functions for conversions on top level?
>
> Just because it's impossible. History sucks. We have mixed up YEN
> SIGN and REVERSE SOLIDUS for long time. They cannot be distinguished
> without context information. Technically 0x5c should mean REVERSE
> SOLIDUS, but not always so for humans.
>
> Besides that, Unicode is not a panacea. Some character set
> (e.g. GB18030 for Chinese characters) is even bigger than Unicode.
> In fact, GB18030 is a super set of Unicode.
>

I wonder how people who suggest Unicode as the single internal
encoding would react if GB18030 was suggested instead ;-)

Thanks

Michal