Mike Calder <ceo / phosco.com> wrote:

>As an outsider, and as an application programmer, if the language has to
>resort to special add-on libraries or different method names to handle
>different character data types, forget it.

Actually, it's not really hard to hide almost all of that from the
application programmer. Windows does it very nicely with its TCHAR macro,
its _T(...) preprocessing magic, and its renaming of API calls depending on
whether the target platform uses 8-bit or 16-bit characters.

>A string is a string is a string. As an application programmer I want to be
>able to use substring, split, character by positional index, and all the
>other standard string methods and not have to worry about what kind of
>string data I am handling. I'm prepared to pass a parameter to stream
>handlers telling them what type of encoding to use between external streams
>and internal representations if I have to - or even between different
>internal representations if that's the sort of thing that is a particular
>programmer's bag. I could even live with parameters like that on the
>standard string methods if I had to.
>
>But the original code that I code and test to work with
>&mylanguageofchoice; MUST work with ANY other language, without change. All
>other language users must have to do to get my program to work in their
>language is to translate fixed strings (and perhaps change GUI layouts
>because of sizings). This is absolutely basic and should be in the Ruby
>core. I thought this kind of thing was why we have OO.
>
>How you do it, I don't care. I don't care what the internal representation
>is; that's an implementation detail, the sort of thing that OO should be
>hiding from me anyway. All I know is I'm handling a string, and I want to do
>string manipulations. I don't care if it's Unicode or ASCII, Kanji or Kanuck.
>
>Sorry, if I can't do that in Ruby, it's broken and, I'm afraid, unusable.

It's amazing (to me, anyway) that you wrote that, as that sentiment so
closely mirrors my own initial attitude towards Ruby.

The first time I was looking at Ruby and trying to decide whether or not I
should invest time in learning it (which was months ago, at the very least),
as soon as I found out that it didn't handle strings as 16 bits internally,
I was *completely* put off. My opinion at the time was that any computer
language that didn't support strings with 16-bit-wide characters could not
be taken seriously. (Yeah, a very snobby attitude, but people are like that
sometimes.)

But, you know what? I was totally wrong. Since Ruby strings *do* permit
embedded '\0' characters, they give me all the flexibility I need. A string
in Ruby is simply a sequence of *bytes*. If I want to create a
'Unicode'-like sequence of 16-bit characters I can - all I need to do is
initialise an array with the proper sequence of numbers and pack that into a
string. (And, of course, remember that I need to implement all operations on
such a sequence by accessing *2* bytes at a time.)

If I want to do anything special with strings, there is nothing stopping me
from writing my own string class...

The beauty of the Ruby way is that it *can* deal with strings that are
internally 16 bits wide or whatever, but still *implement* that logic with
what its compiler sees as 8-bit-wide character sequences (which means even
IMO 'broken' C/C++ compilers can build a working Ruby system).

Vive la OOP difference! ;-)

Martin
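
To make the array-to-string idea above concrete, here is a minimal sketch, assuming a reasonably modern Ruby (1.9.3 or later, for String#byteslice). The sample code points and the helper names wide_char_at and wide_length are hypothetical, chosen only for illustration; it is one possible way to do it, not a definitive implementation.

```ruby
# Sample 16-bit code points: "H", "i", and U+3042 (HIRAGANA LETTER A).
# These values are illustrative only.
code_points = [0x0048, 0x0069, 0x3042]

# Array#pack with "n*" writes each number as a 16-bit big-endian unsigned
# integer, so the result is a plain byte string with embedded "\0" bytes.
wide = code_points.pack("n*")

# Hypothetical helper: return the 16-bit value at a character position by
# reading 2 bytes at a time.
def wide_char_at(str, index)
  str.byteslice(index * 2, 2).unpack("n").first
end

# Hypothetical helper: length in 16-bit characters rather than bytes.
def wide_length(str)
  str.bytesize / 2
end

p wide.bytesize  # => 6
```

Wrapping helpers like these in a small class of one's own would hide the 2-byte bookkeeping from calling code.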