On Wednesday 12 January 2005 09:19 am, Austin Ziegler wrote:
| UTF-16 is better than UCS2 -- UTF-16 is UCS2 with surrogate
| characters, and you don't really gain anything by using UCS4/UTF-32
| (in fact, you lose quite a bit of Huffman encoding). Most Western
| languages can be efficiently presented in UTF-8; most Eastern
| languages are more efficiently presented in UTF-16. (I've done quite
| a bit of research on this.)
|
| Both UCS2 and UCS4 are deprecated encodings, and the only reason
| that they're really in use at this point is because some filesystems
| (NTFS) use UCS2 as their base encoding and cannot therefore support
| UTF-16 encodings.
|
| I do believe that there was discussion regarding having a ByteVector
| sort of class, too -- resulting in the sort of speed optimisations
| necessary without impacting normal String use.

A Blob class? Why would it be called a 'vector'?

T.
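
On the quoted size comparison between UTF-8, UTF-16 and UCS4/UTF-32: one rough way to check it is to encode the same text in each form and count bytes. This is only a sketch, assuming Python 3. The helper name `encoded_sizes` and the sample strings are illustrative assumptions, not anything from the thread, and the little-endian codec names are chosen so that no byte-order mark is counted.

```python
# Compare the encoded size of the same text in UTF-8, UTF-16 and UTF-32.
# The sample strings below are illustrative assumptions, not from the thread.

def encoded_sizes(text):
    """Return the byte length of text in each encoding form (no BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }

samples = {
    "English": "Hello, world",       # ASCII only: 1 byte per character in UTF-8
    "Japanese": "こんにちは世界",      # BMP characters: 3 bytes in UTF-8, 2 in UTF-16
    "Musical symbol": "\U0001D11E",  # outside the BMP: a surrogate pair in UTF-16
}

for name, text in samples.items():
    print(name, encoded_sizes(text))
```

For these particular samples the English string is smallest in UTF-8 (12 bytes against 24 and 48), the Japanese string is smallest in UTF-16 (14 bytes against 21 and 28), and the character outside the BMP takes 4 bytes in all three. That says nothing about other text; mixed markup and prose can shift the balance either way.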