On 6/14/06, Michal Suchanek <hramrach / centrum.cz> wrote:
> What I want is all methods working seamlessly with unicode strings so
> that I do not have to think about the encoding.

That will *never* happen. Even with Unicode, you have to think about
the encoding, because UTF-32 (the closest representation to the
Platonic ideal "Unicode" you'll ever find) is unlikely to be supported
in the general case. Matz's idea of m17n strings is the right one: you
have a "byte stream" and an attribute which indicates how the byte
stream is encoded. This will be somewhat like $KCODE, but at the level
of the individual string, so that you could have Unicode (probably
UTF-8) and Shift_JIS strings in the same data and still meaningfully
call #length on them.
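
A minimal sketch of the idea, assuming the per-string encoding methods
(String#encode, String#force_encoding, String#bytesize) found in Ruby
1.9 and later, not anything available as of this writing. The helper
name char_and_byte_counts is hypothetical, made up for illustration.

```ruby
# Sketch only: assumes the per-string encoding API of Ruby 1.9+.
# char_and_byte_counts is a hypothetical helper, not a library method.
def char_and_byte_counts(str)
  # #length counts characters according to the string's own encoding
  # attribute; #bytesize counts the underlying bytes.
  [str.length, str.bytesize]
end

# "\u" escapes always produce a UTF-8 string, whatever the source encoding.
utf8 = "\u65E5\u672C\u8A9E"          # three kanji, tagged as UTF-8
sjis = utf8.encode("Shift_JIS")      # same text, transcoded and retagged

p char_and_byte_counts(utf8)         # prints [3, 9]
p char_and_byte_counts(sjis)         # prints [3, 6]

# Same bytes as sjis, but with the attribute changed to "no encoding":
raw = sjis.dup.force_encoding("BINARY")
p char_and_byte_counts(raw)          # prints [6, 6]
```

Both tagged strings report a length of 3 even though their byte
streams differ, and the untagged copy falls back to counting bytes.
That is the sense in which you still have to know the encoding.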

You will *always* have to care about the encoding and, ultimately,
about your locale as well.

-austin
-- 
Austin Ziegler * halostatue / gmail.com * http://www.halostatue.ca/
               * austin / halostatue.ca * http://www.halostatue.ca/feed/
               * austin / zieglers.ca