On Wed, 26 Jun 2002, Jan Witt wrote:

> As I see it, the Unicode effort has been deeply
> misguided right from the beginning....

I think that you're deeply misguided right from the beginning about
what Unicode is supposed to do. :-)

> (1) in many languages there are more glyphs than
>     letters in the alphabet, e.g. because of ligatures,
>     i.e. letters that get intertwined with their
>     neighbors (take Hindi or Arabic as examples).
>     Unicode does not cater for this.

Nor is it supposed to. These are typesetting issues, not data
issues. The word "fish" contains the same letters whether or not
you use a ligature for the "fi".
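To make that concrete, here is a minimal sketch in Python (my choice of
language for illustration, using only the standard unicodedata module).
It assumes the text happens to contain the compatibility character
U+FB01, which Unicode carries only for round-tripping older character
sets, and shows that compatibility normalization folds it back to the
same letters as the plain word.

```python
import unicodedata

# U+FB01 LATIN SMALL LIGATURE FI followed by "sh": one presentation form
# standing in for two letters.
ligature_text = "\ufb01sh"
plain_text = "fish"

# NFKC replaces compatibility characters with their plain equivalents.
folded = unicodedata.normalize("NFKC", ligature_text)

same_letters = (folded == plain_text)
print(len(ligature_text), len(folded), same_letters)
```

Whether a renderer then draws "fi" as one glyph or two is the font's
business, not the data's.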

> (2) Diacritics are not everywhere as simple as
>     accents in French, umlauts in German, which
>     luckily could be fit into Latin-1.

So? Unicode deals with a *lot* of diacritical marks. (Take a look
at the Vietnamese support, for example.) Where exactly does Unicode
fall down in supporting diacriticals?
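As a small illustration of the Vietnamese case, here is a sketch in
Python using the standard unicodedata module. The choice of the letter
U+1EC7 (e with circumflex and dot below) is mine, picked because it
stacks two marks on one base letter.

```python
import unicodedata

# U+1EC7: LATIN SMALL LETTER E WITH CIRCUMFLEX AND DOT BELOW
precomposed = "\u1ec7"

# NFD splits it into the base letter plus two combining marks,
# placed in canonical order (dot below, then circumflex).
decomposed = unicodedata.normalize("NFD", precomposed)

# NFC puts the pieces back together, even if the marks were typed
# in the other order.
recomposed = unicodedata.normalize("NFC", "e\u0302\u0323")

print([hex(ord(ch)) for ch in decomposed])
print(recomposed == precomposed)
```

Both spellings are the same text as far as Unicode is concerned, so
stacked diacritics do not need a separate code point for every
combination.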

> (3) Some languages are written from left to right,
>     some from the top down and texts may be mixed.

Unicode supports mixed-direction writing.
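For instance, every character carries a bidirectional class that a
display engine can use to order mixed text. A rough sketch in Python
(standard unicodedata module; the sample string is just something I
made up) that looks those classes up:

```python
import unicodedata

# Three Latin letters, a space, then three Hebrew letters
# (alef, bet, gimel), stored in logical (reading) order.
mixed = "abc \u05d0\u05d1\u05d2"

# bidirectional() returns the character's bidi class:
# "L" = left-to-right, "R" = right-to-left, "WS" = whitespace.
classes = [unicodedata.bidirectional(ch) for ch in mixed]
print(classes)
```

The stored text stays in logical order; reordering for display is done
by the renderer from these per-character properties.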

> Please consider that a multilingual text editor
> must know about the [possibly varying] glyph bindings
> of all of its languages.

Not really. I get by just fine with an editor that cannot generate
"proper" (in print terms) glyphs for "fi", "fl", "ffl", and so on.
I suspect most others do, too.

> (5) Japanese, as you probably know, has the rich
>     choice of kanji characters and the two kana alphabets,
>     but no ligatures.

Since I know a little bit of Japanese, I'd be particularly interested
in what you think the Unicode problems are in relation to Japanese.

> (6) Collating sequences are a nontrivial issue.
>     In classical Spanish, e.g. LL and CH are considered
>     separate characters.

Unicode's character encoding does not impose any collating sequence.
Sort order is a locale matter that gets layered on top of the encoding.
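To show that sort order can live entirely outside the encoding, here is
a toy sketch in Python. The alphabet table and the
traditional_spanish_key helper are hypothetical names of my own, not
part of any library, and a real application would use proper locale
support rather than this.

```python
# Traditional Spanish alphabet, with "ch" and "ll" as letters in their
# own right. Hypothetical table for illustration only.
ALPHABET = ["a", "b", "c", "ch", "d", "e", "f", "g", "h", "i", "j", "k",
            "l", "ll", "m", "n", "ñ", "o", "p", "q", "r", "s", "t", "u",
            "v", "w", "x", "y", "z"]
RANK = {letter: i for i, letter in enumerate(ALPHABET)}

def traditional_spanish_key(word):
    """Turn a word into a list of letter ranks, treating ch/ll as units."""
    word = word.lower()
    key = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in ("ch", "ll"):
            key.append(RANK[pair])
            i += 2
        else:
            # Unknown characters sort after the whole alphabet.
            key.append(RANK.get(word[i], len(ALPHABET)))
            i += 1
    return key

words = ["cosa", "chico", "llama", "luz", "dama"]
by_code_point = sorted(words)
by_tradition = sorted(words, key=traditional_spanish_key)
print(by_code_point)
print(by_tradition)
```

The strings are identical in both cases; only the comparison rule
changes, which is why it makes sense to keep collation separate from
the character set.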

cjs
-- 
Curt Sampson  <cjs / cynic.net>   +81 90 7737 2974   http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light.  --XTC