At 01:24 08/10/21, David Flanagan wrote:
>Tim Bray wrote:
>> However, in Unicode, it's not ambiguous whether a character has the upper-case or lower-case property. What's ambiguous and locale-dependent and not even one-to-one is the mapping between the cases. If Ruby program text were defined as Unicode, I suppose you could allow anything with the "Lu" property.
>>
>> Since Ruby is not limited to Unicode, and we don't know if Unicode and other character sets agree on the semantics of upper-case (I suspect not), it seems to me that the only safe/portable definition of "upper-case" is [A-Z].

Excluding the somewhat complicated issue of the Georgian script (which in its modern form is basically caseless), I cannot currently imagine any case where upper/lower case semantics would differ in legacy encodings. With character issues it's always difficult to say "no, such a thing doesn't exist", but the majority of scripts don't have casing distinctions, and case issues have always been taken as very important when encoding characters in Unicode. So I think this is one of the safer, if not the safest, areas of Unicode.

The fact that only a few scripts have upper-case also means that Ruby class names could be written in only a few scripts. Unless, e.g. for Japanese, we want to come up with a convention such as Katakana for constants, Hiragana for variables :-).

>The consensus seems to be to leave the current rules as they are, and I agree.
>
>I do want to point out, however, that it isn't just case that is ignored outside of the ASCII range. Any character outside of ASCII is considered a letter for the rules of identifier formation, for example. I don't think it would make sense to start paying attention to letter case for constant names without also paying attention more generally to whether a character was in fact a letter for identifier names.

Good point.
But rather than saying "All characters outside ASCII count as lower case, and all codepoints outside ASCII (I guess not just symbols, but even unassigned codepoints) count as letters", I think it would be much better to recognize this as an imperfect intermediate state, with the documentation saying: "Don't count on this staying that way; it may get fixed in the future."

Regards, Martin.

#-#-# Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-# http://www.sw.it.aoyama.ac.jp mailto:duerst / it.aoyama.ac.jp
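
P.S. As a rough illustration of the two candidate definitions of "upper-case" discussed above, here is a minimal sketch. It only tests strings with regular expressions and says nothing about what the Ruby parser itself accepts. The helper names ascii_upper_start? and unicode_upper_start? are made up for this example, and it assumes a Ruby recent enough to have String#match? and the \p{Lu} property class.

```ruby
# Hypothetical helpers (not part of Ruby): two ways to decide whether a
# name "starts with an upper-case letter".

# The portable definition: only ASCII A-Z counts.
def ascii_upper_start?(name)
  name.match?(/\A[A-Z]/)
end

# The Unicode definition: anything with general category Lu counts.
def unicode_upper_start?(name)
  name.match?(/\A\p{Lu}/)
end

samples = [
  "Foo",                      # ASCII upper-case
  "foo",                      # ASCII lower-case
  "\u00C9cole",               # starts with E-acute, category Lu
  "\u30AB\u30BF\u30AB\u30CA"  # Katakana, a script without case
]

samples.each do |s|
  puts "#{s}: ascii=#{ascii_upper_start?(s)} unicode=#{unicode_upper_start?(s)}"
end
```

The two definitions agree on the ASCII names and on the Katakana one (neither treats it as upper-case); they differ only on the name starting with a non-ASCII Lu character.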