On May 7, 5:17 pm, Nanyang Zhan <s... / hotmail.com> wrote:
> Akbar Home wrote:
> > On May 7, 4:12 pm, akbarhome <akbarh... / gmail.com> wrote:
> >> > ۻ۰ Bruce Willis
>
> >> 
> >> Lee xiao ming
>
> > Sorry. Fixed version:
> > a.each {|x|
> >    if x[0].to_i > 128 then
> >      puts x.split(' ', 2)
> >    else
> >      puts x
> >    end
> > }
>
> > This code is quick and dirty.
>
> Thanks.
> But I was wrong. The strings contain more than just Chinese and English
> characters. Now I see characters like , , ... If x starts with one of
> these, x[0] > 128 just as it is for Chinese, but I only want to separate
> the Chinese.
>
> So do you know exactly what range of values Chinese characters
> return? Or can you tell me where I can find this kind of information?
>
> --
> Posted via http://www.ruby-forum.com/.

These:
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/197946
http://www.khngai.com/chinese/charmap/tbluni.php

should get you what you need.

ustr
=> +"ຬʦΤ"
irb(main):027:0> ustr[0]
=> U+6469 <CJK Ideograph>
irb(main):028:0> format "%X", ustr[0].to_i.to_s
=> "6469"
irb(main):029:0>
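
To make the range check concrete, here is a rough sketch of testing the code point instead of the first byte. It assumes Ruby 1.9 or later with UTF-8 strings, which is not what the quoted loop was written for, and it only covers the basic CJK Unified Ideographs block (U+4E00 to U+9FFF); the extension blocks are left out. The names CJK_BASIC, starts_with_cjk? and split_cjk_lines are made up for illustration.

```ruby
# Assumption: Ruby 1.9+ and UTF-8 encoded strings.
# Basic CJK Unified Ideographs block only; extension blocks are not covered.
CJK_BASIC = 0x4E00..0x9FFF

# Hypothetical helper: true when the first character of str falls
# inside the basic CJK block.
def starts_with_cjk?(str)
  first = str.each_char.first
  !first.nil? && CJK_BASIC.cover?(first.ord)
end

# Hypothetical helper: same idea as the quoted loop, but it splits only
# lines that start with a CJK ideograph rather than any byte above 128.
def split_cjk_lines(lines)
  lines.map { |x| starts_with_cjk?(x) ? x.split(' ', 2) : [x] }
end
```

Accented Latin letters and other non-ASCII characters fall outside the range, so they are left alone, which is the case the byte test got wrong.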