Paul Brannan <pbrannan / atdesk.com> writes:

> On Mon, Jan 10, 2005 at 11:53:48PM +0900, Yukihiro Matsumoto wrote:
>> The "right" definition of characters differs application to
>> application.  That's the reason I don't add a Character class.  I want
>> to leave it to the user.

This sounds likely to result in duplicated effort.  A pragmatic
approach seems better: I don't think it would be very hard to provide a
default Character class that people can "customize" by subclassing or
method redefinition.
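
A rough sketch of what such a customizable default could look like.
Everything here is hypothetical: neither Character nor
CaselessCharacter exists in Ruby, and the choice to wrap a
one-character String is only an assumption made for illustration.

```ruby
# Hypothetical sketch: none of these classes exist in Ruby itself.
class Character
  include Comparable

  attr_reader :str

  def initialize(str)
    raise ArgumentError, "expected exactly one character" unless str.length == 1
    @str = str
  end

  # Default ordering: compare the underlying one-character strings.
  def <=>(other)
    other.is_a?(Character) ? str <=> other.str : nil
  end

  def to_s
    str
  end
end

# An application with a different notion of "character" customizes
# the default by subclassing and redefining a method.
class CaselessCharacter < Character
  def <=>(other)
    other.is_a?(Character) ? str.downcase <=> other.str.downcase : nil
  end
end

Character.new("a") == Character.new("A")                 # false
CaselessCharacter.new("a") == CaselessCharacter.new("A") # true
```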

> I don't understand what you mean here.  How is having "abc"[0] return a
> String a better solution than having "abc"[0] return a Character?  Is it
> less restrictive in some way?

I can't quite follow the line of reasoning here either.

> Anyway, some questions:

I'll try to answer them from my view of a future Character class.

> 1. Will this be true?
>
>   ?a == "a"

No.  However, ?a === "a" should be true.
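
To illustrate the distinction, here is a minimal sketch of those
semantics.  The Character class below is hypothetical and stands in
for whatever ?a would return; it does not describe the behaviour of
any released Ruby.

```ruby
# Hypothetical sketch of the proposed semantics.
class Character
  attr_reader :str

  def initialize(str)
    @str = str
  end

  # Strict equality: a Character only equals another Character.
  def ==(other)
    other.is_a?(Character) && str == other.str
  end

  # Case equality: additionally matches a String with the same content.
  def ===(other)
    self == other || (other.is_a?(String) && str == other)
  end
end

a = Character.new("a")   # stands in for the literal ?a

a == "a"    # false
a === "a"   # true

# case/when dispatches through ===, so a String can be matched
# against a Character here.
result = case "a"
         when a then :matched
         else :missed
         end
```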

> It would allow code like this to be forward-compatible:
>
>   line = gets
>   if line[0] == ?A then
>     ...
>   end

This code will work anyway, because line[0] is a Character, just like ?A.

> 2. What will the encoding be of the character following the ? mark?  Can
>    I write:
>
>   if line[0] == ?<some utf-8 character> then
>
> or must I use a String instead?

The encoding of the string/char will be the same as that of the source
file.  (See below.)

> 3. Can I compare two strings that have two different encodings?

Yes.  There will probably be a need for "fuzzy" matching, though...
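
One possible reading of "fuzzy" matching is to compare content after
transcoding both sides to a common encoding.  The sketch below assumes
a Ruby with per-string encodings and String#encode (Ruby 1.9 or
later); the helper name same_text? is made up for this example.

```ruby
# Hypothetical helper: compare by content after transcoding to UTF-8.
def same_text?(a, b)
  a.encode(Encoding::UTF_8) == b.encode(Encoding::UTF_8)
rescue EncodingError
  false   # one side cannot be converted, so treat the pair as unequal
end

utf8   = "caf\u00E9"                         # UTF-8 encoded
latin1 = utf8.encode(Encoding::ISO_8859_1)   # same text, different bytes

utf8 == latin1             # false: the encodings and bytes differ
same_text?(utf8, latin1)   # true: equal once both are UTF-8
```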

> 4. Will $KCODE change to allow more encodings or will it be going away?

I'd propose replacing $KCODE with something a bit more elegant and
OO, as there will be a need for

  - default encoding
  - source encoding (possibly file local)
  - stream encoding (possibly stream local)
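
The three levels above could be modelled roughly as follows.  This is
only a sketch under assumptions: EncodingSettings and its methods are
invented names, and encodings are represented as plain strings for
simplicity.

```ruby
# Hypothetical: EncodingSettings is an illustrative name, not a Ruby API.
class EncodingSettings
  attr_accessor :default

  def initialize(default)
    @default = default
    @source  = {}   # source file path => encoding name
    @stream  = {}   # stream => encoding name
  end

  def set_source(path, encoding)
    @source[path] = encoding
  end

  def set_stream(io, encoding)
    @stream[io] = encoding
  end

  # A file-local or stream-local setting wins; otherwise fall back
  # to the default encoding.
  def for_source(path)
    @source.fetch(path, default)
  end

  def for_stream(io)
    @stream.fetch(io, default)
  end
end

settings = EncodingSettings.new("utf-8")
settings.set_source("legacy.rb", "euc-jp")

settings.for_source("legacy.rb")   # "euc-jp"
settings.for_source("other.rb")    # "utf-8"
```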

A controversial issue: I strongly suggest that utf-8 be the
default file encoding.  IMHO, this will not affect users of other
encodings, as they need(ed) to specify the exact encoding in any case.

> 5. Can there be user-defined encodings (e.g. if some user wants to
>    provide utf-16)?

IMO, utf-16 should be provided in the core.  User-defined encodings
should be possible too, though.

> 6. Should String#encoding return a String or a Symbol?

How about the class that handles the encoding?
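
A sketch of what that might mean in practice: the string hands back
the handler class itself, and encoding-specific work is delegated to
it.  UTF8, Latin1 and EncodedString are hypothetical names used only
for this illustration.

```ruby
# Hypothetical handler classes, one per encoding.
class UTF8
  def self.char_count(bytes)
    # Count every byte that is not a continuation byte (10xxxxxx).
    bytes.each_byte.count { |b| (b & 0xC0) != 0x80 }
  end
end

class Latin1
  def self.char_count(bytes)
    bytes.bytesize   # one byte per character
  end
end

# Stand-in for String, to avoid redefining the real class here.
class EncodedString
  attr_reader :bytes, :encoding

  def initialize(bytes, encoding)
    @bytes    = bytes
    @encoding = encoding   # the handler class, not a String or Symbol
  end

  def length
    encoding.char_count(bytes)
  end
end

s = EncodedString.new("caf\xC3\xA9".b, UTF8)
s.encoding   # UTF8
s.length     # 4
```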


> Paul
-- 
Christian Neukirchen  <chneukirchen / gmail.com>  http://kronavita.de/chris/