Hi,
In message "Re: [ruby-core:19383] Re: Constant names in 1.9"
on Sun, 19 Oct 2008 01:12:12 +0900, Tim Bray <Tim.Bray / Sun.COM> writes:
|However, in Unicode, it's not ambiguous whether a character has the
|upper-case or lower-case property. What's ambiguous and locale-
|dependent and not even one-to-one is the mapping between the cases.
|If Ruby program text were defined as unicode, I suppose you could
|allow anything with the "Lu" property.
That makes things clearer to me. It's not too difficult to use the Lu
property when scripts are written in Unicode, but I don't think we need
to allow identifiers written in multibyte characters, which tend to
cause problems.
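For illustration, here is a minimal sketch of what an Lu-based check
could look like. It assumes a Ruby whose regexp engine accepts the
\p{Lu} property on UTF-8 strings; constant_like? is a hypothetical
helper, not something taken from Ruby's parser.

```ruby
# Hypothetical helper: true when the first character of `name`
# has the Unicode "Lu" (uppercase letter) general category.
def constant_like?(name)
  !(name =~ /\A\p{Lu}/).nil?
end

constant_like?("Foo")           # => true  (ASCII uppercase)
constant_like?("\u00C9cole")    # => true  (E with acute, non-ASCII uppercase)
constant_like?("\u00E9cole")    # => false (e with acute, lowercase)
```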
I have no questions about identifiers in Ruby programs, but I do have
some questions about string case conversion.
Some languages support case conversion of non-ASCII strings in Unicode
(perhaps ignoring the Turkish case). I myself sometimes want this
functionality to normalize the full-width alphabets used in Japanese.
Europeans probably have a greater need for it.
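As a small sketch of the full-width use case: this assumes Ruby 2.4 or
later, where String#downcase follows Unicode case mappings (an
assumption about later versions, not the behavior of 1.9), and uses
String#unicode_normalize for the compatibility form.

```ruby
# Assumes Ruby 2.4+, where case conversion covers non-ASCII characters.
wide  = "\uFF32\uFF35\uFF22\uFF39"        # full-width "RUBY"
lower = wide.downcase                     # full-width "ruby" (U+FF52 U+FF55 U+FF42 U+FF59)
ascii = wide.unicode_normalize(:nfkc)     # => "RUBY" (plain ASCII)
```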
As far as I know, the issues are:
* some case conversions do not map one-to-one (German eszett)
* some case conversions do not round-trip (German eszett)
* some case conversions depend on the locale (Turkish i)
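The three issues above can be seen in a short sketch. It assumes Ruby
2.4 or later, whose String#upcase and String#downcase are Unicode-aware
and take a :turkic option; nothing here reflects 1.9 behavior.

```ruby
# Assumes Ruby 2.4+ (Unicode-aware String#upcase / #downcase).
eszett = "stra\u00DFe"        # "strasse" spelled with eszett (U+00DF)
upper  = eszett.upcase        # => "STRASSE": one character became two
back   = upper.downcase       # => "strasse": not the original string

# Turkish dotted/dotless i, selected explicitly with :turkic
# rather than through a process-wide locale.
default_i = "i".upcase              # => "I"
turkic_i  = "i".upcase(:turkic)     # => "\u0130" (capital I with dot above)
dotless_i = "I".downcase(:turkic)   # => "\u0131" (dotless i)
```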
Are there any other issues? How big are they? Can they be ignored?
How do other languages treat them?
matz.