On Sun, 19 Oct 2008 03:43:32 +1100, Yukihiro Matsumoto  
<matz / ruby-lang.org> wrote:


> As far as I know, the issues are:
>
>   * some case conversion does not map one to one (German eszett)
>   * some case conversion does not round trip (German eszett)
>   * some case conversion rely on locale (Turkish i)
>
> Are there any other issues?  How big are they?  Can they be ignored?

I believe there are a few more issues, e.g. context sensitivity: a
letter can lowercase/uppercase differently depending on its position in
a word (Greek capital sigma lowercases differently at the end of a
word). It is also possible for a character to have a "title case" form
(used for the first letter of a title-cased word) that differs from its
normal uppercase form.
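
To illustrate the context-sensitive case, here is a rough sketch for the
Greek sigma only. downcase_sigma is a made-up helper, not anything in
Ruby, and it assumes UTF-8 source and strings. It deliberately leaves
every other letter untouched, so it is nowhere near a full downcase:

```ruby
# Hypothetical helper (not part of Ruby): lowercases only Greek capital
# sigma, choosing the lowercase form from its position in the word.
def downcase_sigma(str)
  # A capital sigma preceded by a letter and not followed by one is
  # treated as word-final and becomes the final form.
  str.gsub(/(?<=\p{L})Σ(?!\p{L})/, "ς").
      gsub("Σ", "σ")  # every remaining capital sigma becomes the normal form
end

downcase_sigma("ΟΔΟΣ")  # => "ΟΔΟς"
```

The point is that the result for one character depends on its
neighbours, so a simple per-character table is not enough.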

Please see Unicode Technical Report #21 (Case Mappings):
http://unicode.org/reports/tr21/tr21-5.html

I think it would probably be wrong to attempt a 100% implementation in
String#upcase, #downcase, #casecmp, etc., because of the complexity and
because performance is likely to be poor. Perhaps ultimately someone
could develop a library that deals specifically with non-ASCII case
conversion, which could be used when necessary?
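
Something along these lines is what I have in mind, purely as a sketch.
The CaseMap module, its tables and the locale argument are all
hypothetical, and it only covers the eszett and the Turkish i from the
list above, falling back to ASCII-only conversion for everything else:

```ruby
# Hypothetical module; the name, tables and locale symbols are
# illustrative only and do not exist in Ruby.
module CaseMap
  # One-to-many mapping: eszett uppercases to two characters.
  SPECIAL_UPPER = { "ß" => "SS" }
  # Locale-dependent mapping: dotted i -> U+0130, dotless U+0131 -> I.
  TURKIC_UPPER  = { "i" => "\u0130", "\u0131" => "I" }

  def self.upcase(str, locale = nil)
    table = SPECIAL_UPPER
    table = table.merge(TURKIC_UPPER) if [:tr, :az].include?(locale)
    # Use the special tables first, then plain ASCII upcasing.
    str.each_char.map { |ch| table[ch] || ch.tr("a-z", "A-Z") }.join
  end
end

CaseMap.upcase("straße")         # => "STRASSE"
CaseMap.upcase("istanbul")       # => "ISTANBUL"
CaseMap.upcase("istanbul", :tr)  # => "\u0130STANBUL"
```

Note that "STRASSE" lowercases back to "strasse", not "straße", which is
the round-trip problem again.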

Cheers
Mike