Dave Thomas <Dave / PragmaticProgrammer.com> writes:

> matz / ruby-lang.org (Yukihiro Matsumoto) writes:
>
>> No.  Considering multilingualization issues, character class is a
>> gate to the hell.  No one agrees with each other.  I'd like to
>> leave Char class to the users.
>
> Matz:
>
> I know absolutely nothing about m17n, so forgive me if this is
> naive. In Unicode, I believe that it is possible for two
> 'characters' to be equal even if their bit patterns are different:
> there are different ways of representing the ways that the
> characters are constructed. Might that be an argument in favor of a
> character class, so that
>
>    str.char_at(i)  #=> aChar
>
> and aChar retains sufficient information from the String to allow
>
>    aChar == otherChar
>
> to be evaluated correctly?

Unicode deals with numbers (code points), so comparing two of them is
a plain integer comparison and the problem doesn't exist there.

You are probably thinking of UTF-8, where a lax decoder can accept more
than one byte sequence (the so-called overlong forms) for the same
Unicode number.  But I assume a String#char_at(i) method would return
the Unicode number itself, not a UTF-8 byte sequence.
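
To make that concrete, here is a rough sketch of what I have in mind.
The char_at helper is hypothetical (Ruby's String has no such method),
and I am assuming the string is held as UTF-8 so that unpack("U*") can
decode it into Unicode numbers:

```ruby
# Hypothetical helper, not part of Ruby's String API.
# Assumes str is UTF-8; returns the Unicode number (an Integer) of the
# i-th character, or nil if i is past the end.
def char_at(str, i)
  str.unpack("U*")[i]
end

a = char_at("caf\u00e9", 3)      # last character of the first string
b = char_at("\u00e9t\u00e9", 0)  # first character of the second string

puts a == b    # true: both are the number 0xE9
puts a.class   # Integer
```

The comparison never looks at bytes, only at the decoded numbers.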


> I know you live with this nightmare daily, and you've thought this
> through a lot more than I have, but as the encodings get more
> complex, doesn't the need for Char increase? After all, there are
> both sins of commission and sins of omission, and both lead us to
> the same gate. :)

So long as String#char_at can return a 4-byte numerical value, I think
it pretty much covers everything.
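
As a quick sanity check on the 4-byte figure, the sketch below packs the
largest Unicode number, 0x10FFFF, into a 32-bit field and reads it back.
MAX_CODE_POINT is just a name I made up for the example; pack("N") is
the standard 32-bit big-endian unsigned format:

```ruby
# MAX_CODE_POINT is an illustrative constant, not something Ruby defines.
MAX_CODE_POINT = 0x10FFFF

packed = [MAX_CODE_POINT].pack("N")   # 32-bit big-endian unsigned

puts packed.bytesize                             # 4
puts packed.unpack("N").first == MAX_CODE_POINT  # true
puts MAX_CODE_POINT.bit_length                   # 21
```

So the largest value needs 21 bits, and 4 bytes leaves room to spare.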

-- 
matt