On 22.6.2006, at 10:17, Yukihiro Matsumoto wrote:

> In message "Re: Unicode roadmap?"
>     on Thu, 22 Jun 2006 15:55:18 +0900, "Lugovoi Nikolai"  
> <meadow.nnick / gmail.com> writes:
> |> I am eager to hear.
> |
> |So what will be semantic for encoding tag:
> | a) weak suggestion?
> | b) strong assertion?
>
> Weak suggestion, if I understand you correctly.
>
> |I'd prefer encoding tag as strong assertion, mostly for
> |reliability reasons.
>
> Hmm, your idea of combination of strong assertion and automatic
> conversion seems too complex for me, but it may be worth considering.

Strong assertion + auto conversion is the only solution that will
relieve programmers of manually checking/changing string encodings
in their programs.
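
To make the idea concrete, here is a rough sketch of what strong
assertion + auto conversion could look like. It assumes a String that
carries an encoding tag and offers `encoding`, `valid_encoding?` and
`encode`, as the String API of Ruby 1.9 and later does. `concat_text`
is a hypothetical helper for illustration only, not a proposed
interface.

```ruby
# Hypothetical helper: combine two strings whose encoding tags may differ.
# Strong assertion: a string whose bytes do not match its tag is rejected.
# Auto conversion: the second operand is transcoded to the first's encoding.
def concat_text(a, b)
  [a, b].each do |s|
    unless s.valid_encoding?
      raise ArgumentError, "bytes do not match tag #{s.encoding}"
    end
  end
  a + b.encode(a.encoding)
end

utf8   = "caf\u00E9"                        # tagged UTF-8
latin1 = " ol\u00E9".encode("ISO-8859-1")   # tagged ISO-8859-1, different bytes

joined = concat_text(utf8, latin1)          # caller never touches encodings
```

In the proposal this logic would live inside the interpreter, so an
ordinary `utf8 + latin1` would behave this way without any helper.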

Remember, string input/output points in a program are not only system
IO classes, but also all the third party libraries/classes which deal
with strings. That means most of the existing Ruby libraries, plus
external (e.g. Java) libraries which can be used from Ruby.

The assumption that only system IO is the entry/exit point for string  
encoding is very wrong. This assumption holds only for scripts which  
use no third party libraries.

So we have two possibilities:
a) every programmer is forced to implement the above solution in
every program (this is starting to happen already, and current
experience tells us that the future in this direction is a disaster!)
b) the Ruby interpreter implements this solution, and programmers
happily ignore all the complexity.

So, it is true that we move the complexity into Ruby, but this is
(IMHO) much less complicated and much more needed than e.g. the
arbitrarily large integers (Bignum) which we already have.

If Ruby wants to move forward, it needs transparent String support
and, hopefully, a separation of String and ByteArray. The lack of that
separation has given us code which is mostly wrong: most existing Ruby
code currently breaks if string encoding is honoured, as the experience
of the brave people who modified the String class shows.
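
A minimal sketch of the String/ByteArray split, again written against
the encoding-aware String of Ruby 1.9 and later. The `ByteArray` class
and its `size` and `decode` methods are hypothetical names chosen for
illustration. The point is only that raw octets have no character
semantics, and that text is obtained from them by naming an encoding
explicitly.

```ruby
# Hypothetical ByteArray: raw octets with no character semantics.
class ByteArray
  def initialize(bytes)
    # Keep a private binary-tagged copy so the octets are never
    # mistaken for characters.
    @bytes = bytes.dup.force_encoding(Encoding::BINARY).freeze
  end

  # Number of octets, never characters.
  def size
    @bytes.bytesize
  end

  # The only way to obtain text: state the encoding explicitly.
  def decode(encoding)
    text = @bytes.dup.force_encoding(encoding)
    raise ArgumentError, "not valid #{encoding}" unless text.valid_encoding?
    text
  end
end

raw  = ByteArray.new("caf\u00E9")   # e.g. octets read from a socket
text = raw.decode("UTF-8")          # explicit step from bytes to text

octets = raw.size       # counts bytes
chars  = text.length    # counts characters
```

With this split, code that counts or slices characters can only ever
receive a String, so it cannot silently operate on undecoded bytes.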

Ruby is my favourite language, and if it had String support as
suggested, software development would be just pure joy...

Please listen to the people who tell of disastrous experiences in
other languages. As for good experience: I have been developing in
Cocoa on Mac OS X for many years, and it has a great String class (ok,
the suggested Ruby class would be even better, but still). It also
keeps String and byte array separate. The results are superb. There
are no problems, and nobody ever worries about strings and encodings.
Ever. You can check the mailing lists.


izidor