Yukihiro Matsumoto wrote:
> You said Tcl has Unicode support that works well with you. So that I
> think treating all of them in UTF-8 is OK for you.

It's not really about treating everything as UTF-8. Tcl unifies all of its strings so that a single string can hold any variety of characters.

> Then how can it
> determine which should be in the current code page, or in Unicode?
> Or using Win32 API ending with W could allow you living in the
> Unicode?

Well, currently (I just downloaded the latest CVS sources) Ruby uses the ANSI versions of the CreateFile and FindFirstFile/FindNextFile APIs. So even if I set, for example, KCODE to UTF-8 (I'm not sure how you can currently make Ruby work with UTF-8), the ANSI versions of the APIs are still called, and that means:

1) If there are filenames with characters outside the range of the current code page, I receive '?' in place of them when I enumerate directory contents.
2) I receive filenames in the current code page, not in UTF-8.
3) There is no way for me to open a file whose name contains such characters using the standard Ruby classes.

The same goes for the win32ole extension. I can see a lot of ole_wc2mb/ole_mb2wc calls there, which break things horribly when interoperating with, for example, Excel and working with Russian, Greek, Japanese and other languages all on the same sheet (after I process the sheet, modifying all of the cells, it strips every language except Russian from it).

On *nix you can just change your locale to *.UTF-8 and you're fine, because everything you receive when enumerating a directory is UTF-8, and File.open expects UTF-8. Unfortunately, that is not possible on Windows: MS already provides 'wide' versions of the APIs for those who need them, and there is no UTF-8 ANSI code page you can set as the default (the UTF-8 code page in Windows is somewhat 'virtual', for conversion purposes only).
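
To make the '?' problem from the list above concrete, here is a rough sketch that only simulates the lossy step. It assumes a newer Ruby where strings carry an encoding and String#encode is available, which is not the Ruby in CVS discussed here. The helper name through_ansi_code_page and the filename are made up for illustration, and "Windows-1251" merely stands in for a Russian system code page; no Win32 API is actually called.

```ruby
# encoding: utf-8
# Sketch only: imitates what happens to a name that passes through an
# ANSI ("A") API. Characters missing from the code page come back as '?'.
# `through_ansi_code_page` is a hypothetical helper, not part of Ruby.
def through_ansi_code_page(name, code_page)
  name.encode(code_page, undef: :replace, replace: "?") # unmappable chars -> '?'
      .encode("UTF-8")                                   # back to UTF-8 to inspect it
end

# Hypothetical filename mixing Russian, Greek and Japanese.
name = "отчёт_αβγ_日本語.txt"

puts through_ansi_code_page(name, "Windows-1251")
# prints: отчёт_???_???.txt
```

Only the Russian part survives, and the resulting name can no longer be used to open the original file, which is what points 1) and 3) describe.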

In Tcl all of your strings are in UTF-8, and when Tcl interoperates with the rest of the world it converts strings appropriately. For example, on Win9x there are almost no 'wide' APIs, so it converts strings to the current code page and uses the ANSI APIs, but on WinNT it converts them to Unicode and uses the 'wide' APIs.

What I was thinking of is a way to set a "current codepage" for Ruby on win32 (including the possibility of setting it to UTF-8), so that when Ruby talks to the outside world it would use the 'wide' APIs where possible, converting to and from this code page. Unlike Tcl, where the encoding is hard-coded to UTF-8, there would be a choice. There is no other way to do this on Windows, because the user can't set the current code page to UTF-8.

--
Posted via http://www.ruby-forum.com/.
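
A rough sketch of the conversion layer I have in mind, only to show the idea. It assumes a newer Ruby with String#encode; the class OsBoundary and its parameters (script_encoding, wide_apis, ansi_code_page) are hypothetical names, and nothing like this exists in Ruby or Tcl under these names. It performs the string conversions only and does not call any Win32 API.

```ruby
# Sketch only: a hypothetical boundary between Ruby strings and the OS.
# script_encoding - the user-selectable "current codepage" for Ruby strings
# wide_apis       - true where 'wide' APIs exist (WinNT), false on Win9x
# ansi_code_page  - stand-in for the system ANSI code page
class OsBoundary
  def initialize(script_encoding:, wide_apis:, ansi_code_page: "Windows-1252")
    @script_encoding = script_encoding
    @wide_apis = wide_apis
    @ansi_code_page = ansi_code_page
  end

  # Convert a Ruby-side string into what the OS call would be given.
  def to_os(str)
    s = str.dup.force_encoding(@script_encoding)
    if @wide_apis
      s.encode("UTF-16LE") # lossless; a real call would also need a NUL terminator
    else
      s.encode(@ansi_code_page, undef: :replace, replace: "?") # lossy fallback
    end
  end

  # Convert what the OS returned back into the script encoding.
  def from_os(bytes)
    source = @wide_apis ? "UTF-16LE" : @ansi_code_page
    bytes.dup.force_encoding(source)
         .encode(@script_encoding, undef: :replace, replace: "?")
  end
end

# Usage: with 'wide' APIs a mixed-language name survives the round trip.
nt = OsBoundary.new(script_encoding: "UTF-8", wide_apis: true)
name = "\u0444\u03b1\u65e5.txt" # one Russian, one Greek, one Japanese character
puts nt.from_os(nt.to_os(name)) == name
```

With wide_apis set to false the same round trip keeps only the characters the ANSI code page can represent, which is the Win9x case above.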