MonkeeSage wrote:
> I guess we were talking about different things then. I never meant to
> imply that the regexp engine can't match unicode characters

Since regular expressions are embedded in the very syntax of ruby just 
as arrays and hashes, IMHO that qualifies as unicode support. So yeah, 
it seems like we have a semantic disagreement. :-(

> I, like Charles (and I think most people), was referring to the
> ability to index into strings by characters, find their lengths in
> characters

That is certainly *one* way of supporting unicode but by no means the 
only way. My belief is that you can do most string manipulations in a 
way that obviates the need for char indexing & char length, if only you 
change your mindset from "operating on individual characters" to 
"operating on the string as a whole". And since regexes are a specialized 
language for string manipulation, they're also a lot faster than looping 
over characters by hand. It's a 
little like imperative vs functional programming; if I told you about a 
programming language that has no variable assignments you might think 
it's completely broken, and yet that's how functional languages work.
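
To make the "whole string" mindset a bit more concrete, here is a minimal 
sketch of character-level operations done purely with UTF-8 regexes. The 
sample string and the variable names are made up for illustration, and the 
force_encoding guard is an assumption on my part so that the same lines 
also run on rubies that tag strings with an encoding (1.9 and later).

```ruby
# UTF-8 bytes for "naïve café", written with octal escapes.
s = "na\303\257ve caf\303\251"
# On 1.9+ make sure the string is tagged as UTF-8; 1.8 has no such method.
s = s.dup.force_encoding("UTF-8") if s.respond_to?(:force_encoding)

# /u makes "." match one UTF-8 character rather than one byte;
# /m lets "." match newlines too.
chars    = s.scan(/./mu)        # array of characters
length   = chars.length         # 10 characters (the string is 12 bytes)
first3   = s[/\A.{3}/mu]        # first three characters: "naï"
reversed = chars.reverse.join   # reversed by character, not by byte
```

None of this indexes into the string by character position; the regex 
does the work of finding character boundaries.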

> to compose and decompose composite characters, to
> normalize characters, convert them to other encodings like shift-jis,
> and other such things.

Converting encodings is a worthy goal but unrelated to unicode support. 
As for character [de]composition that would be a very nice thing to have 
if it was handled automatically (e.g. "a\314\200"=="\303\240") but if 
the programmer has to worry about it then you might as well leave it to 
a specialized library. Well, it's not like ruby lets us abstract away 
composite characters in either 1.8 or 1.9... I never claimed unicode 
support was 100%, just good enough for most needs.
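
For what the composition problem looks like in practice, here is a small 
sketch using the two strings from above. It assumes a much later ruby 
(2.2 or newer) where String#unicode_normalize exists; that method is not 
available in the 1.8 or 1.9 being discussed here, so take it only as an 
illustration of the explicit step the programmer has to perform.

```ruby
# "a" followed by COMBINING GRAVE ACCENT (U+0300), as UTF-8 bytes.
decomposed  = "a\314\200".dup.force_encoding("UTF-8")
# Precomposed LATIN SMALL LETTER A WITH GRAVE (U+00E0), as UTF-8 bytes.
precomposed = "\303\240".dup.force_encoding("UTF-8")

# Plain == compares the underlying bytes, so the two forms differ.
same_bytes = (decomposed == precomposed)                          # false

# After normalizing to NFC, the decomposed form becomes the precomposed one.
same_nfc = (decomposed.unicode_normalize(:nfc) == precomposed)    # true
```

So the comparison only succeeds once the caller normalizes by hand, which 
is exactly the kind of worry I'd rather see handled automatically.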

> just a difference of opinion. I don't mind being wrong (happens a
> lot! ;) I just don't like being accused of spreading FUD about ruby,
> which to my mind implies malice of forethought rather that simply
> mistake.

Yes, that was too harsh on my part. My apologies.

Daniel