On Thu, 1 Aug 2002, Alexander Bokovoy wrote:

> On Thu, Aug 01, 2002 at 09:55:48PM +0900, Curt Sampson wrote:
> > On Thu, 1 Aug 2002, Alexander Bokovoy wrote:
> >
> > > Unicode 3.1 is 32-bit wide.
> >
> > I have just looked at my 3.0 standard and the 3.1 and 3.2 updates on the
> > web site, and I do not see any evidence of this. Did I miss something?
> > See the message I just posted for the details as I know them.
>
> http://www.unicode.org/unicode/reports/tr19/tr19-9.html :
> [section 3, Relation to ISO/IEC 10646 and UCS-4]

Actually, I was looking for someone to attack my argument, not
support it. :-)

What this says is that they are removing some private code areas
in the ISO 10646 UCS-4 encoding so that it will become smaller and
compatible with UTF-32. And, as it says at the beginning of that
document:

    UTF-32 is restricted in values to the range 0..10FFFF (hex), which
    precisely matches the range of characters defined in the Unicode
    Standard (and other standards such as XML), and those representable
    by UTF-8 and UTF-16.

So Unicode is not 32-bit in any sense of the word. A character in
the UTF-32 encoding of Unicode takes up 32 bits, but since the largest
value is 10FFFF (hex), only the low 21 bits are ever used.
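
A quick way to check that arithmetic is the following minimal Python
sketch. It is only an illustration, not anything taken from TR19; the
variable names are made up for the example, and it assumes Python 3,
where chr() accepts code points up to 10FFFF (hex) and nothing above.

```python
# Largest code point allowed by the range quoted above.
MAX_CODE_POINT = 0x10FFFF

# Number of bits needed to represent the largest code point.
bits_needed = MAX_CODE_POINT.bit_length()

# Encode that code point as big-endian UTF-32 (no byte order mark).
encoded = chr(MAX_CODE_POINT).encode("utf-32-be")

# Bits in one UTF-32 code unit that can never be set.
unused_bits = len(encoded) * 8 - bits_needed

# Anything past the end of the range is rejected outright.
try:
    chr(MAX_CODE_POINT + 1)
    beyond_range_allowed = True
except ValueError:
    beyond_range_allowed = False

print(bits_needed, len(encoded), unused_bits, beyond_range_allowed)
```

Every UTF-32 code unit is four bytes wide, so the difference between
that width and bits_needed is the part that always stays zero.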

> > Mojikyo wants to give maximum flexibility in the display of Chinese
> > characters. Given the number and complexity of kanji, these two aims are
> > basically incompatible.
>
> I still don't see why both goals should be incompatible a priori. But this
> is possibly off-topic here. :)

Partly efficiency concerns. As the speed of CPUs increases relative
to memory, the relative cost of string handling (which is pretty
memory intensive) gets higher and higher. And also things like ease
of use; avoiding duplications makes things like pattern matching
and use of dictionaries much easier. (Imagine, for example, that
ASCII had two 'e's in it, and people used one or the other randomly,
as they liked. Now instead of writing s/feet/fleet/, you have to
write at least s/f[ee][ee]t/fleet/, or in certain fussy cases even
s/f([ee][ee])t/fl\1t/. Ouch.)
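
To make the two-'e' scenario concrete, here is a rough Python sketch.
ASCII of course has only one 'e', so as an assumption for the demo the
Cyrillic letter U+0435, which looks the same in most fonts, stands in
for the hypothetical second one. The names E1, E2 and samples are
invented for this example.

```python
import re

E1 = "e"        # Latin small letter e, U+0065
E2 = "\u0435"   # Cyrillic small letter ie, U+0435 (stand-in for the second 'e')

# The same word typed with every mix of the two look-alike letters.
samples = ["f" + a + b + "t" for a in (E1, E2) for b in (E1, E2)]

# The simple substitution only matches the spelling that uses E1 twice.
naive = [re.sub("feet", "fleet", s) for s in samples]

# The fussy version: accept either letter in each position and put
# back whichever ones the writer actually used.
either_e = "[" + E1 + E2 + "]"
pattern = "f(" + either_e + either_e + ")t"
fussy = [re.sub(pattern, r"fl\1t", s) for s in samples]

for original, n, f in zip(samples, naive, fussy):
    print(ascii(original), ascii(n), ascii(f))
```

Only the first sample is changed by the simple pattern; the character
class version catches all four, at the cost of a pattern that is much
harder to read and write.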

cjs
-- 
Curt Sampson  <cjs / cynic.net>   +81 90 7737 2974   http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light.  --XTC