Issue #2034 has been updated by Yui NARUSE.


Michael Friedman wrote:
> Hi.  I'm a newcomer to Ruby - studying it right now - but I've been writing multi-lingual systems for 15 years.  I think I can shed some light on internationalization issues.
> 
> First, I have to say that I was pretty amazed when I discovered that Ruby is not either a multi-character set system or native Unicode.  I just assumed that since it comes from Japan and is a relatively new language multi-byte and Unicode support would have been automatic for its developers.  Well, that's life and you can't go back in time and change things.

Thank you for your interest in Ruby.
But you seem to be using Ruby 1.8, which has only limited support for multibyte characters.

If you want m17n support, use 1.9 and read the following documents.

* http://yokolet.blogspot.com/2009/07/design-and-implementation-of-ruby-m17n.html
* https://github.com/candlerb/string19
* http://blog.grayproductions.net/categories/character_encodings
----------------------------------------
Feature #2034: Consider the ICU Library for Improving and Expanding Unicode Support
http://redmine.ruby-lang.org/issues/2034

Author: Run Paint Run Run
Status: Assigned
Priority: Normal
Assignee: Yui NARUSE
Category: M17N
Target version: 2.0


=begin
 Has consideration recently been given to employing the ICU library (http://site.icu-project.org/) in Ruby? The bindings are in C and the library is mature. My ignorance of the Ruby source notwithstanding, this would allow existing String methods, among others, to support non-ASCII characters in an incremental manner.
 
 For a trivial example, consider String#to_i. It currently understands only ASCII characters which represent digits. ICU provides a u_charDigitValue(code_point) function which returns the integer corresponding to the given Unicode codepoint. Were String#to_i to use this, it would work with non-ASCII counting systems, thus removing at least one of the "as long as it's ASCII" caveats currently associated with String methods.
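
 To make the String#to_i example concrete, here is a minimal pure-Ruby sketch of the idea. It does not call ICU. The DIGIT_ZEROS table and the char_digit_value and unicode_to_i helpers are hypothetical names invented for illustration, and the table covers only four scripts, whereas u_charDigitValue covers every Unicode decimal digit. Like u_charDigitValue, the stand-in returns -1 for a code point that is not a decimal digit.

```ruby
# Hypothetical table (assumption, not from ICU): the code point of the
# digit zero in a few scripts. Unicode decimal digits form contiguous
# runs of ten, so value = code_point - zero within each run.
DIGIT_ZEROS = [
  0x0030, # ASCII
  0x0660, # Arabic-Indic
  0x0966, # Devanagari
  0xFF10  # Fullwidth
].freeze

# Rough stand-in for ICU's u_charDigitValue(code_point):
# returns 0..9 for a known decimal digit, -1 otherwise.
def char_digit_value(code_point)
  zero = DIGIT_ZEROS.find { |z| code_point.between?(z, z + 9) }
  zero ? code_point - zero : -1
end

# Hypothetical to_i-like helper: accumulates leading decimal digits and
# stops at the first non-digit. Signs, whitespace, and underscores,
# which String#to_i accepts, are deliberately left out of this sketch.
def unicode_to_i(str)
  result = 0
  str.each_codepoint do |cp|
    value = char_digit_value(cp)
    break if value < 0
    result = result * 10 + value
  end
  result
end

arabic_indic = "\u0661\u0662\u0663" # Arabic-Indic digits one, two, three
puts arabic_indic.to_i              # String#to_i sees no ASCII digits
puts unicode_to_i(arabic_indic)     # the sketch reads them as decimal digits
```

 A real integration would presumably replace the table lookup with a call into ICU from C, leaving the accumulation loop essentially unchanged.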
 
 More generally, if it's desirable for String methods to properly support Unicode, and if the principal barrier is the difficulty of the implementation, then might there be at least a partial solution in marrying Ruby with ICU?
 
 If ICU is unfeasible, I'd appreciate understanding why. There are multiple approaches to what I term the second phase of Unicode support in Ruby, and it will be easier to choose between them if I understand the constraints. :-) (Of course, if a direction has already been determined, and work on it is underway, I will gladly bow out ;-)).
=end



-- 
http://redmine.ruby-lang.org