On Mon, 15 Dec 2008 20:12:59 +1100, Yukihiro Matsumoto <matz / ruby-lang.org> wrote:

> You're right. When we have two strings with identical byte sequence
> but different encodings, we have to tell they are different. The
> comparison result does not matter much, so I used encoding index.
> Is there any alternative choice that makes sense?

It probably doesn't make sense to try to order two strings of
incompatible encodings, so what you have done is probably as good as
anything else.

The only real alternative is to raise an Encoding Compatibility error,
but I don't think that is a good idea either, because I believe you
would want s1 == s2 to return false, rather than raise an error, on
incompatible encodings. So if you consider String#<=> as the "base" for
all the string comparison methods (whether implemented that way or
not), then to be consistent with "==" it would have to return a value
for all possible encodings of s1 and s2, compatible or not. That
implies that String#>, String#<, etc. must all return a value as well.

By the way, I think I phrased my description of the String method
implementations badly. I meant to say that Strings are stored as the
bytes of their representation in their encoding, not as an array of
codepoints. There *are* some methods which must convert characters to
codepoints for their implementation, but this happens "on the fly".
Many common String methods (e.g. concatenation) operate directly on the
bytes without needing to convert to codepoints.

I should also have pointed out that Ruby goes to a lot of trouble to
optimize methods operating on single-byte character strings in order to
keep their performance good.

Mike
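
For anyone who wants to try the behaviour discussed above, here is a minimal sketch, assuming Ruby 1.9 or later. The helper name compare_report and the choice of ISO-8859-1 and Windows-1252 as the two encodings are illustrative assumptions, not taken from the thread. It shows two strings with identical bytes but different encodings comparing as unequal while <=> still returns a value, and it shows the stored bytes next to the codepoints computed on demand.

```ruby
# Hypothetical helper (not from the original thread): summarise how two
# strings compare with ==, <=>, < and >.
def compare_report(s1, s2)
  {
    :same_bytes => s1.bytes.to_a == s2.bytes.to_a,
    :equal      => s1 == s2,
    :cmp        => s1 <=> s2,
    :less       => s1 < s2,
    :greater    => s1 > s2
  }
end

# The same single byte 0xE9, tagged with two different encodings.
# These two encodings are an assumption chosen for illustration.
latin1 = "\xE9".dup.force_encoding("ISO-8859-1")
cp1252 = "\xE9".dup.force_encoding("Windows-1252")

# Same bytes, but == is false and <=> is non-zero (no exception raised).
p compare_report(latin1, cp1252)

# Strings are stored as bytes; codepoints are derived when asked for.
e_acute = "\u00E9"           # one character, UTF-8 encoded
p e_acute.bytes.to_a         # [195, 169]
p e_acute.codepoints.to_a    # [233]
```

The sign of the <=> result depends on internal encoding indexes, so the sketch relies only on it being non-zero.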