On 8/1/06, Daniel DeLorme <dan-ml / dan42.com> wrote:
> Paul Battley wrote:
> >> > Actually, that's a really good idea. Which languages/frameworks have
> >> > you found that actually do it right? We could learn from their
> >> > example.
> >>
> >> To my knowledge you are intimately familiar with the subject so I
> >> take it as sarcasm.
> >
> > I'm not being sarcastic at all, though perhaps I could have phrased it
> > better. It's just that all Unicode discussions in Ruby end up going
> > round and round in circles; if we as a community could identify some
> > first-class examples of Doing It Right, I think we'd have some useful
> > yardsticks. You are someone with particularly high expectations
> > (rightly so) of Unicode support in a language: have you found anything
> > that really impressed you?
>
> I second that. I see a lot of people asking for "transparent" Unicode support
> but I don't see how that is possible. To me it's like asking for a language that
> has transparent bug recovery. I know that Ruby has weaknesses when it comes to
> multibyte encodings, but the main problem is human in nature; too many people
> assume that char==byte, which results in bugs when someone unexpectedly uses
> "weird" characters. IMHO no amount of "transparent support" will change that.
> But I would love to be shown otherwise with examples of languages that "do it
> right".
>
By transparent I mean that I can iterate, compare, match, index, and
so on, not only over bytes but also over at least code points (and
grapheme clusters, if somebody is kind enough to implement that, though
that is not very important to me right now), all while using the
standard string class that every standard function accepts.

In Ruby 1.8, working with anything but bytes is like scratching your
right ear with your left hand... or leg.
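
To make that concrete, here is a minimal sketch of the kind of detour
I mean. The sample string and the variable names are made up for
illustration; the only library calls are String#unpack and Array#pack
with the 'C' (byte) and 'U' (UTF-8 code point) directives.

```ruby
# Illustrative sample: "naive" with an i-diaeresis (U+00EF), written as
# UTF-8 byte escapes so the source file itself stays plain ASCII.
s = "na\xC3\xAFve"

# Byte view. In Ruby 1.8 this is also what String#length and String#[]
# operate on.
bytes = s.unpack('C*')

# Code point view: 'U*' decodes the bytes as UTF-8.
codepoints = s.unpack('U*')

# Character view: re-encode each code point as its own UTF-8 string.
chars = codepoints.map { |cp| [cp].pack('U') }

puts bytes.length       # 6 (the i-diaeresis takes two bytes)
puts codepoints.length  # 5
```

Everything past the first unpack is extra work that the caller has to
remember to do; with transparent support the string class itself would
offer the code point view.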

Thanks

Michal