From: Yukihiro Matsumoto [mailto:matz / ruby-lang.org]
Sent: Wednesday, June 14, 2006 9:35 AM
> Hi,
> 
> In message "Re: Unicode roadmap?"
>     on Wed, 14 Jun 2006 14:26:30 +0900, "Victor Shepelev"
> <vshepelev / imho.com.ua> writes:
> 
> |I suppose, all we (non-English-writers) need is to have all
> |string-related methods working. Just for now, I think about plain
> |testing each string method;
> 
> In that sense, _I_ am one of the non-English-writers, 

Sorry, Matz, I know, of course. But I know too little about Japanese to see
how close our tasks are. By "non-English-writers" I should perhaps have said
"writers of European languages" or so: languages that share common
punctuation, LTR writing, "words", "whitespace" and so on. I know almost
nothing about what Japanese, Korean, Arabic or Hebrew users need.

> so that I can
> suppose I know what we need.  And I have no problem with the current
> UTF-8 support.  Maybe that's because Japanese don't have cases in our
> characters.  Or maybe I'm missing something.  

Just what I've said above. 

> Can you show us your
> concrete problems caused by Ruby's lack of "proper" Unicode support?

As mentioned in this thread, they are String#length, upcase, downcase and
capitalize.
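
A minimal sketch of the kind of check I mean, assuming the source file is
saved as UTF-8. The word and the variable names are only for illustration.
What String#length and String#upcase print depends on the Ruby version, so
the sketch prints them instead of claiming a result:

```ruby
# -*- coding: utf-8 -*-
# Illustrative word: six Cyrillic letters, each two bytes long in UTF-8.
word = "привет"

# Count bytes and Unicode code points explicitly with unpack, so the
# numbers do not depend on how String#length itself is implemented.
byte_count = word.unpack("C*").size
char_count = word.unpack("U*").size

puts "bytes: #{byte_count}, characters: #{char_count}"

# A byte-counting String reports the byte count here; a character-aware
# one reports the character count.
puts "String#length reports: #{word.length}"

# Whether the Cyrillic letters change case depends on the Ruby version.
puts "String#upcase gives:   #{word.upcase}"
```

If String#length prints the byte count rather than the character count, that
is exactly the problem I am talking about.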

BTW, does String#length work well for you?

Moreover, there seem to be some huge problems with paths containing Russian
letters, but I'm not really convinced that Ruby has to handle this.

> 
> |also, some other classes can be affected by Unicode (possibly
> |regexps, and pathes). Regexps seems to work fine (in my 1.9), but
> |pathes are not: File.open with Russian letters in path don't finds
> |the file.
> 
> Strange.  Ruby does not convert encoding, so that there should be no
> problem opening files, if you are using strings in the encoding your OS
> expect.  If they are differ, you have to specify (and convert) them
> properly, no matter how Unicode support is.

Oh, it's a rather hard topic for me. I know Windows XP must support Unicode
file names; I see my file names in Russian, but I know too little about the
system internals to say whether they are really Unicode.
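
To make the point about "the encoding your OS expects" concrete, here is a
rough sketch. It assumes a Ruby that has String#encode, and it assumes
Windows-1251 as the legacy encoding purely for illustration; I do not know
which encoding my system really hands to Ruby. The file name is hypothetical:

```ruby
# -*- coding: utf-8 -*-
# Hypothetical file name, written as UTF-8 in the source file.
name = "привет.txt"

# Assumption for illustration only: the OS side uses Windows-1251.
legacy = name.encode("Windows-1251")

# Print both byte sequences in hex to compare them by eye.
puts name.bytes.map { |b| "%02X" % b }.join(" ")
puts legacy.bytes.map { |b| "%02X" % b }.join(" ")

# The same visible name is two different byte strings, so a lookup by
# one of them cannot match a file stored under the other.
same = name.bytes.to_a == legacy.bytes.to_a
puts "identical bytes? #{same}"
```

If the bytes differ like this, the string has to be converted before it is
passed to File.open, whatever the state of Unicode support in String.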

If we set those problems aside, only the String problems remain, but they
are such basic core methods!

V.