MonkeeSage wrote:
> I guess we were talking about different things then. I never meant to
> imply that the regexp engine can't match unicode characters

Since regular expressions are embedded in the very syntax of ruby just
as arrays and hashes, IMHO that qualifies as unicode support. So yeah,
it seems like we have a semantic disagreement. :-(

> I, like Charles (and I think most people), was referring to the
> ability to index into strings by characters, find their lengths in
> characters

That is certainly *one* way of supporting unicode but by no means the
only way. My belief is that you can do most string manipulations in a
way that obviates the need for char indexing & char length, if only you
change your mindset from "operating on individual characters" to
"operating on the string as a whole". And since regexes are a
specialized language for string manipulation, they're also a lot faster.
It's a little like imperative vs functional programming; if I told you
about a programming language that has no variable assignments you might
think it's completely broken, and yet that's how functional languages
work.

> to compose and decompose composite characters, to
> normalize characters, convert them to other encodings like shift-jis,
> and other such things.

Converting encodings is a worthy goal but unrelated to unicode support.
As for character [de]composition, that would be a very nice thing to
have if it was handled automatically (e.g. "a\314\200"=="\303\240") but
if the programmer has to worry about it then you might as well leave it
to a specialized library. Well, it's not like ruby lets us abstract away
composite characters either in 1.8 or 1.9... I never claimed unicode
support was 100%, just good enough for most needs.

> just a difference of opinion. I don't mind being wrong (happens a
> lot! ;) I just don't like being accused of spreading FUD about ruby,
> which to my mind implies malice of forethought rather that simply
> mistake.

Yes, that was too harsh on my part. My apologies.

Daniel
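
To make the "operating on the string as a whole" idea above a bit more concrete, here is a minimal sketch, assuming the strings hold valid UTF-8. The helper names char_count and first_chars are made up for illustration and are not part of ruby. It also shows the composite character case from the message: the precomposed and decomposed forms count differently and do not compare equal unless something normalizes them first.

```ruby
# Hypothetical helpers (illustration only): let the regex engine find the
# UTF-8 character boundaries instead of indexing by character.

# Count characters by scanning with a UTF-8 aware regex (/u).
# The /m flag lets "." match newlines as well.
def char_count(str)
  str.scan(/./mu).size
end

# Take the first n characters with a single anchored match.
def first_chars(str, n)
  str[/\A.{0,#{Integer(n)}}/mu]
end

precomposed = "\303\240"   # U+00E0 (a with grave) as a single code point
decomposed  = "a\314\200"  # "a" followed by U+0300 (combining grave)

puts char_count(precomposed)      # 1
puts char_count(decomposed)       # 2
puts precomposed == decomposed    # false: no automatic normalization
puts first_chars("h\303\251llo", 2) == "h\303\251"  # true
```

Note that the regex counts code points, so the decomposed form is reported as two characters even though it renders as one.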