On Tue, Apr 8, 2008 at 7:40 AM, Thomas Kellerer
<YQDHXVLMUBXG / spammotel.com> wrote:
> Marc Heiler, 08.04.2008 13:24:
> > [Unicode]
> > > I need it.
> > I keep on reading people that need Unicode, and in your case it may very
> > well be true and for many others as well.
>  That's precisely the ignorant attitude that caused the issues we currently
> have with different character sets. I'm pretty sure that if computer systems
> had emerged from a non-English speaking country at the beginning we
> wouldn't need to still fight character set issues (there are still too many
> applications that even have problems with 8-bit character sets).

Unfortunately, Marc didn't keep *my* quote, which carried a lot more
important context than what was kept. I never argued that people don't
need Unicode. I said that most people don't need Unicode munging of
their text. Think operations on the strings rather than the existence
of the strings themselves.
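
To make that distinction concrete, here is a minimal sketch, assuming
Ruby 1.9 or later (where every String carries an encoding). The
variable names are made up for illustration. Storing or copying the
bytes needs no Unicode knowledge at all; counting or reversing
characters does.

```ruby
# Illustrative sketch only (assumes Ruby 1.9+ per-string encodings).
text  = "na\u00EFve"                               # UTF-8 text with one non-ASCII character
bytes = text.dup.force_encoding(Encoding::BINARY)  # same bytes, treated as raw data

# Existence: the data survives storage and copying untouched.
same_bytes = (bytes.bytes.to_a == text.bytes.to_a)

# Operations: these need to know where characters begin and end.
char_count = text.length    # counts characters
byte_count = bytes.length   # counts bytes

reversed_chars = text.reverse
reversed_bytes = bytes.reverse.force_encoding(Encoding::UTF_8)

puts same_bytes                      # true
puts char_count                      # 5
puts byte_count                      # 6
puts reversed_bytes.valid_encoding?  # false: the multibyte sequence was split
```

Only the last few lines are "Unicode munging"; everything above them
works on any byte string.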

>  I'm a newbie with Ruby and until I read this discussion I simply assumed it
> would fully support Unicode "out of the box" especially given the fact that
> it originates from Japan. I'm actually very confused (not to say shocked)
> that there *is* a discussion if Ruby needs (or supports?) Unicode.
>  Unicode (and a relevant encoding such as UTF-8) should be the *standard* for
> all (new) programming languages and not an exception.

No, it shouldn't. Yes, Ruby needs Unicode support. But Unicode has a
big problem: legacy data. There's more legacy textual data than there
is Unicode textual data at this point. That will change, yes, but it
isn't so now. Languages that assume all of their strings are Unicode
make legacy data much harder to work with.
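
As a rough illustration of what working with legacy data looks like
when the language does not force everything into Unicode, here is a
sketch assuming Ruby 1.9 or later. The sample bytes and the choice of
ISO-8859-1 are assumptions for the example, not anything from a real
system.

```ruby
# Illustrative sketch only (assumes Ruby 1.9+).
# Four bytes as they might arrive from a legacy ISO-8859-1 file.
raw = "caf\xE9".force_encoding(Encoding::BINARY)

# Label the bytes with their real encoding; nothing is converted.
latin1 = raw.dup.force_encoding(Encoding::ISO_8859_1)

# Transcode to UTF-8 only if and when something actually needs it.
utf8 = latin1.encode(Encoding::UTF_8)

puts latin1.bytesize        # 4
puts latin1.length          # 4
puts utf8.bytesize          # 5
puts utf8 == "caf\u00E9"    # true
```

The legacy string stays usable as it is; the conversion is an explicit
step rather than something imposed on every string at the door.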

Also, look at the Han Unification discussions regarding Unicode and
you'll see why the lack of Unicode support from Japan through early
this decade isn't surprising. Unicode also isn't very friendly to Asian
texts in terms of storage size: in UTF-8, most CJK characters take
three bytes, against two in legacy encodings such as Shift_JIS or
EUC-JP. That's not a big deal now that we're dealing with massive hard
drives and our textual data is a minuscule fraction of our overall data
storage (audio, images, video).
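
The storage point can be checked with a small sketch, assuming Ruby
1.9 or later and assuming UTF-8 as the Unicode encoding being compared.
The sample text is just three common kanji picked for illustration.

```ruby
# Illustrative sketch only (assumes Ruby 1.9+ and its bundled transcoders).
text = "\u65E5\u672C\u8A9E"   # three kanji, written as escapes to stay source-encoding neutral

utf8_size  = text.encode(Encoding::UTF_8).bytesize
sjis_size  = text.encode(Encoding::Shift_JIS).bytesize
eucjp_size = text.encode(Encoding::EUC_JP).bytesize

puts utf8_size    # 9 (three bytes per character)
puts sjis_size    # 6 (two bytes per character)
puts eucjp_size   # 6 (two bytes per character)
```

That is a 50% increase for pure kanji text, which mattered a great
deal more when disks were small.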

Ruby 1.8 has limited (too limited, but there are good historical
reasons for this) support for Unicode; Ruby 1.9 has good support for
Unicode and it's getting better.

History is good to know. It would have prevented this blogger from
being ignorant.

-austin
-- 
Austin Ziegler * halostatue / gmail.com * http://www.halostatue.ca/
 * austin / halostatue.ca * http://www.halostatue.ca/feed/
 * austin / zieglers.ca