2008/9/19 Martin Duerst <duerst / it.aoyama.ac.jp>:
> At 09:42 08/09/18, Austin Ziegler wrote:
>>(see the discussions around Han unification for a brief
>>primer on the issues involved).
> Complaints about Han unification are mostly unjustified. The discussion
> e.g. around Internationalized Domain Names has shown that unification
> has significant advantages. You get into problems when e.g. a Latin
> 'A', a Cyrillic 'A', and a Greek 'A' are encoded separately (as they
> currently are, not least because they are encoded separately in
> some important East Asian standards).
> I do not want to imagine the mess we would have if there were separate
> codes for Chinese/Japanese/Korean (and maybe Vietnamese, Taiwanese,...)
> "variants" of Han characters such as '一' (one), '二' (two), '三' (three),
> and so on.

I'm not disagreeing with you in principle, but even if the complaints
are unjustified, the fact is that they exist and they slowed adoption
of Unicode in Asian countries pretty significantly.
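
For anyone who wants to see the separately encoded look-alike letters
mentioned above, here is a minimal sketch. It assumes Ruby 1.9 or later
(for "\u" escapes and String#ord); the variable names are only
illustrative.

```ruby
# Three visually similar capital letters from three scripts,
# written as escapes so the differences are visible in the source.
letters = {
  "Latin"    => "\u0041", # LATIN CAPITAL LETTER A
  "Cyrillic" => "\u0410", # CYRILLIC CAPITAL LETTER A
  "Greek"    => "\u0391", # GREEK CAPITAL LETTER ALPHA
}

# Print each script name with its code point in U+XXXX form.
letters.each do |script, ch|
  puts format("%-8s U+%04X", script, ch.ord)
end

# The glyphs may render identically, but the strings never compare equal.
distinct = letters.values.uniq.size == 3
puts "all distinct: #{distinct}"
```

This is the sort of case where a domain name can look the same to a
reader yet be a different string to the machine.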

>>The problem is exacerbated in the
>>academic arena where people want to be able to represent ancient
>>characters accurately, but it's not limited to that.
> Yes, and if you look at academic use, the same can be said for
> the Western World. As a simple example, Unicode doesn't contain
> codepoints for all the many ligatures used in the Gutenberg bible.
> The only difference may be that researchers in the West are
> more ready to use an additional layer (e.g. some XML markup or so)
> for this, whereas in Asia, the fact that there is already
> such a huge number of characters makes it very easy for people
> to think that just adding more characters is the solution
> for these problems.

It's also a little different for Asian researchers because
different characters are different words. It may also be a display
problem: most Western-language ligatures can be approximated on
computer displays with just a little tweaking of how two characters
are displayed, even if a separate glyph in a font is always better.
This isn't always possible with Asian-language characters, as I
understand it.

Still, I am encouraged to see Ruby keeping m17n while also improving
its Unicode support.

-austin
-- 
Austin Ziegler * halostatue / gmail.com * http://www.halostatue.ca/
 * austin / halostatue.ca * http://www.halostatue.ca/feed/
 * austin / zieglers.ca