Michal Suchanek wrote:
> I guess this should give you what you want:
>
> irb(main):001:0> s = "大智若愚 asdfaf sdgs"
> => "\345\244\247\346\231\272\350\213\245\346\204\232 asdfaf sdgs"
> irb(main):002:0> s.unpack "U*"
> => [22823, 26234, 33509, 24858, 32, 97, 115, 100, 102, 97, 102, 32,
> 115, 100, 103, 115]

Michal,

Thanks!

Chinese characters run from U+4E00 to U+9FA5 in the Unicode table, and CJK Symbols and Punctuation range from U+3000 to U+303F.

I just combined my strategy with this new approach (unpack "U*") to identify Chinese. It picked out 100% of the Chinese phrases from the strings (1000 strings were tested).

To all of you who replied and helped: thank you!

Enjoy!

-- 
Posted via http://www.ruby-forum.com/.
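
For anyone who finds this thread later, here is a minimal sketch of how the range check described above might look. Only unpack("U*") and the two code point ranges come from the thread; the constant and method names (CJK_IDEOGRAPHS, CJK_PUNCTUATION, chinese_codepoint?, chinese_runs) are made up for illustration and are not the actual code used for the 1000-string test. It assumes the input string is UTF-8.

```ruby
# Ranges quoted in the post above. Names are hypothetical.
CJK_IDEOGRAPHS  = (0x4E00..0x9FA5)
CJK_PUNCTUATION = (0x3000..0x303F)

# True if the code point falls in either range.
def chinese_codepoint?(cp)
  CJK_IDEOGRAPHS.cover?(cp) || CJK_PUNCTUATION.cover?(cp)
end

# Returns the consecutive runs of Chinese characters found in str.
# Assumes str is valid UTF-8, since unpack("U*") decodes UTF-8.
def chinese_runs(str)
  runs = []
  current = []
  str.unpack("U*").each do |cp|
    if chinese_codepoint?(cp)
      current << cp
    elsif !current.empty?
      # A non-Chinese code point ends the current run.
      runs << current.pack("U*")
      current = []
    end
  end
  runs << current.pack("U*") unless current.empty?
  runs
end

# The sample string from the irb session, written with \u escapes.
sample = "\u5927\u667A\u82E5\u611A asdfaf sdgs"
p chinese_runs(sample)
```

Grouping consecutive code points into runs is just one way to get "phrases" out of the check; a stricter or looser definition (for example, also accepting full-width forms) would need additional ranges that are not discussed in this thread.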