On 3/14/06, Bill Kelly <billk / cts.com> wrote:
> From: "Austin Ziegler" <halostatue / gmail.com>
> > On 3/13/06, Anthony DeRobertis <aderobertis / metrics.net> wrote:
> >>         UTF-8 can take more than one octet to represent a
> >>         character; UTF-16 can take more than two; UTF-32
> >>         more than four; etc.
> > No. UTF-32 does not have surrogates. Unicode is perfectly
> > representable in either 20 or 21 bits. A single character is *always*
> > representable in a uint32_t sized space with UTF-32.
> Hi, I have zero background in non-ASCII character representations,
> but the following post has been echoing in my head as a data point
> for... can't believe it's been three-and-a-half years:
>
> http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/46284
>
> Does that have any relation to your current context?  Curt seems to
> be talking not of surrogates, but saying "combining characters"
> mean variable-length issues still exist with UTF-32 ?

Yes and no. When you use combining characters, each of the combining
characters (such as COMBINING CEDILLA or COMBINING ACUTE ACCENT) is a
distinct character. If I understand the Unicode standard correctly --
which is perhaps questionable -- you can go either direction: compose
a base character and its combining marks into one precomposed
character, or decompose a precomposed character into that sequence.
But I had forgotten (temporarily) about combining characters. For the
most part, Apple chooses to use them and Microsoft chooses not to use
them in native representations wherever possible. Where it becomes
difficult is when you need to combine characters that do not otherwise
have precomposed (canonical) forms. At *that* point, yes, UTF-32 can
have multiple uint32_t elements creating one character. I think that
for most languages, though, the use of combining characters is not
necessary.

I withdraw my absolute, though. If you're creating a meaningful glyph
with combining characters, you *can* have multiple uint32_t elements
creating that glyph in UTF-32. Without combining characters, however,
UTF-32 represents every character in Unicode with a single uint32_t
element.

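To make the point concrete, here is a minimal sketch in Ruby. It
assumes a current Ruby (2.5 or later, for String#unicode_normalize and
String#grapheme_clusters), and utf32_units is a hypothetical helper
named only for this illustration. It counts how many 32-bit elements a
string needs in UTF-32, with and without combining characters, and
uses "q" plus COMBINING ACUTE ACCENT as an example of a combination
that has no precomposed form.

```ruby
# Hypothetical helper: number of 32-bit code units the string
# occupies when encoded as UTF-32 (little-endian, no BOM).
def utf32_units(str)
  str.encode("UTF-32LE").bytesize / 4
end

precomposed = "\u00E9"   # LATIN SMALL LETTER E WITH ACUTE
decomposed  = "e\u0301"  # "e" + COMBINING ACUTE ACCENT
uncomposed  = "q\u0301"  # "q" + COMBINING ACUTE ACCENT, no precomposed form

puts utf32_units(precomposed)                           # => 1
puts utf32_units(decomposed)                            # => 2

# Going in either direction between the two spellings of e-acute:
puts utf32_units(decomposed.unicode_normalize(:nfc))    # => 1
puts utf32_units(precomposed.unicode_normalize(:nfd))   # => 2

# Composition cannot help when no precomposed character exists:
puts utf32_units(uncomposed.unicode_normalize(:nfc))    # => 2

# Yet a reader still sees a single character:
puts uncomposed.grapheme_clusters.length                # => 1
```

So one uint32_t per code point always holds, but one uint32_t per
visible character holds only when the text avoids combining
characters.
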
-austin
--
Austin Ziegler * halostatue / gmail.com
               * Alternate: austin / halostatue.ca