Yukihiro Matsumoto wrote:
> I am not sure what you expect about separation, but I doubt separation
> would make above code to "be more logical and break far less".

Just jumping into the discussion here, I have to agree with Matz. A char-vector 
is simply a higher-level representation of a byte-vector, not different enough 
to warrant two entirely separate classes.

I think the real issue is not technical but rather a problem of perception and 
education. Ever since C-style strings, programmers have learned to view a string 
as an array of chars. So when we need to do char-string manipulation, we resort 
to pointer arithmetic when in fact the "correct" and ruby-native way of 
manipulating strings is with regular expressions. Instead of giving in to this 
old string-as-array mentality, maybe we should teach people to use regular 
expressions? Hmmm, probably impossible.

A string can be interpreted as either a sequence of bytes or a sequence of 
characters, but the methods can be confusing. Obviously, upcase and downcase are 
operations at the character level, but what is [] supposed to do? From the ruby 
point of view, str[0..3] gives you the first 4 bytes and str.scan(/^..../) gives 
you the first 4 characters. But for the majority with the string-as-array 
mentality, [] is ambiguous; does it give you access to the bytes or to the 
characters of the string? In the interest of facilitating education, there needs 
to be a clear disambiguation; instead of str[0..3] it should be str.byte(0..3) 
and str.char(0..3) -- with maybe the latter one giving a warning along the lines 
of "use regular expressions!" ;-)  That way the ambiguity between byte-vector 
and char-vector could be resolved.
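
To make the distinction concrete, here is a rough sketch of what the two accessors might look like. The names byte_range and char_range are made up for illustration, standing in for the proposed str.byte(0..3) and str.char(0..3). It assumes a ruby with encoding-aware strings (1.9 or later, where [] itself indexes characters and String#byteslice is available), so treat it as one possible reading of the idea rather than a real API.

```ruby
# Hypothetical helpers, not part of ruby's String API.

# Byte view: return the raw bytes in the given range.
# This may cut a multibyte character in half.
def byte_range(str, range)
  str.byteslice(range)
end

# Character view, done the regular-expression way:
# /./m matches exactly one whole character (newlines included).
def char_range(str, range)
  str.scan(/./m)[range].join
end

# Three 3-byte UTF-8 characters followed by four 1-byte ones.
s = "\u65E5\u672C\u8A9Etext"

first_bytes = byte_range(s, 0..3)   # 4 bytes: one full character plus a fragment
first_chars = char_range(s, 0..3)   # 4 characters: the three kanji plus "t"

puts first_bytes.bytesize
puts first_bytes.valid_encoding?
puts first_chars.length
```

With the two operations under separate names, the caller has to say which view is wanted, and the byte version makes it obvious that it can return a broken character.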

Daniel