John Joyce wrote:

> I don't know if the two main chinese sets are encoded as different
> ranges or simply declared in some way.
> In general in Unicode a character is the same character even when it
> appears in a different language.

Many characters in these two sets of Chinese (and, in fact, the Chinese
characters used in Japanese and Korean too) are the same. Aren't they
encoded with the same codes when they are identical?
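
One way to check this is to compare code points directly. The sketch below is only an illustration and assumes Ruby 1.9 or later (for the \u escape); the code_point helper and the three sample characters are my own picks, not something from the earlier posts. It suggests that a character with the same shape in both sets has one code point, while a simplified/traditional pair with different shapes gets two.

```ruby
# Hypothetical helper: decode a one-character UTF-8 string to its code point.
def code_point(char)
  char.unpack("U*").first
end

shared      = "\u4E2D"  # U+4E2D, same shape in simplified and traditional text
traditional = "\u570B"  # U+570B, traditional form of "country"
simplified  = "\u56FD"  # U+56FD, simplified form of the same word

# The shared character has a single code point whichever language it appears in.
puts format("shared:      U+%04X", code_point(shared))
# The two differently shaped forms are encoded separately.
puts format("traditional: U+%04X", code_point(traditional))
puts format("simplified:  U+%04X", code_point(simplified))
```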

Gary Thomas wrote:
> I believe the range is (in hex) 3400 to 97A5
You must mean the Unicode range:
http://www.khngai.com/chinese/charmap/tbluni.php?page=0

John Joyce wrote:
> You might want to check the RubyGems gem   unihan
Hmm... if only I could find out what it does.

John Joyce wrote:

> http://www.alanwood.net/unicode/index.html

> I've been interested in this subject myself, but it is a big one.

Interesting subject indeed it is.

Today I tried this (note: under the RoR console):
>> c=%w{ ɡ                                  ݡ                       }
=> ["", "ɡ", "", "", "", "", "", "", "", "", "", "", "", 
"", "", "", "", "", "", "", "", "", "", "", "", "", " ", 
"", "", "", "", "", "", "", "ݡ", "", "", "", "", "", "", 
"", "", "", "", "", "", "", "", "", "", "", "", "", "", 
"", "", ""]
>> c.collect.map{|o| o[0]}
=> [226, 226, 239, 239, 239, 239, 239, 226, 239, 239, 239, 239, 239, 
226, 239, 239, 239, 228, 228, 229, 229, 229, 229, 229, 229, 229, 229, 
229, 229, 230, 230, 230, 230, 230, 230, 230, 230, 231, 231, 231, 231, 
231, 231, 231, 231, 231, 231, 231, 233, 233, 233, 233, 233, 233, 233, 
233, 233, 233]
>> c.collect.map{|o| o[0]}.sort
=> [226, 226, 226, 226, 228, 228, 229, 229, 229, 229, 229, 229, 229, 
229, 229, 229, 230, 230, 230, 230, 230, 230, 230, 230, 231, 231, 231, 
231, 231, 231, 231, 231, 231, 231, 231, 233, 233, 233, 233, 233, 233, 
233, 233, 233, 233, 239, 239, 239, 239, 239, 239, 239, 239, 239, 239, 
239, 239, 239]
>> c.collect.map{|o| o[0]}.sort.uniq
=> [226, 228, 229, 230, 231, 233, 239]
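
For context on those numbers: on Ruby 1.8, o[0] returns the first byte of the string, not a whole character. The sketch below assumes Ruby 1.9 or later and a sample character of my own choosing (U+4E2D); it shows the first byte next to the full UTF-8 byte sequence and the code point. Any code point from U+0800 to U+FFFF is encoded in three bytes whose first byte lies between 224 and 239, so the first byte alone says little about the script.

```ruby
# Assumes Ruby 1.9+ (for the \u escape and String#bytes).
ch = "\u4E2D"                        # U+4E2D, a common Han character

bytes      = ch.bytes.to_a           # every byte of the UTF-8 encoding
first_byte = bytes.first             # the value o[0] gave on Ruby 1.8
code_point = ch.unpack("U*").first   # the decoded Unicode code point

puts bytes.inspect
puts first_byte
puts format("U+%04X", code_point)
```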

The punctuation marks are those commonly used in China.
The Chinese characters were picked at random from
http://www.khngai.com/chinese/charmap/tbluni.php?page=0
(from all six pages).

So maybe a first byte between 226 and 239 is the range I need.
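
A first-byte test is probably too coarse, though, since other symbols share those first bytes. A rough alternative, assuming Ruby 1.9 or later, is to decode each character and compare its code point against the Unicode blocks. The cjk_class helper and the CJK_RANGES table below are hypothetical names of mine, and the table lists only a few blocks, so treat it as a starting point rather than a complete test.

```ruby
# Hypothetical lookup table: a few Unicode blocks relevant to Chinese text.
# It is deliberately incomplete (no Extension B and later, no compatibility blocks).
CJK_RANGES = {
  :han_ext_a       => 0x3400..0x4DBF, # CJK Unified Ideographs Extension A
  :han             => 0x4E00..0x9FFF, # CJK Unified Ideographs
  :cjk_punctuation => 0x3000..0x303F, # CJK Symbols and Punctuation
  :fullwidth_forms => 0xFF00..0xFFEF  # Halfwidth and Fullwidth Forms
}

# Hypothetical helper: return the block name for a one-character UTF-8
# string, or nil when the character is in none of the listed blocks.
def cjk_class(char)
  code = char.unpack("U*").first
  CJK_RANGES.each do |name, range|
    return name if range.include?(code)
  end
  nil
end

puts cjk_class("\u4E2D").inspect  # U+4E2D, a Han character
puts cjk_class("\u3002").inspect  # U+3002, ideographic full stop
puts cjk_class("\uFF0C").inspect  # U+FF0C, fullwidth comma
puts cjk_class("\u20AC").inspect  # U+20AC, euro sign: first byte is 226, yet not Chinese
```

The last call is the reason for preferring code points: the euro sign starts with byte 226 like the punctuation in my test, but it falls outside every block in the table.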

-- 
Posted via http://www.ruby-forum.com/.