On Monday 29 September 2008 16:18:53 Bob Marley wrote:
> How can I get a tally of how many characters in a Unicode string are
> Japanese (hiragana, katakana, kanji)? When I unpack a string, each
> character comes out like \xE3\x81\x95, but I am trying to check if it's
> in the range 3040-309F (Hiragana) and I don't understand how to convert
> between the 3-byte representation and that range...

You can look up the Unicode mapping on Google, but then you will have to
write a new decoding function for each possible encoding (UTF-8,
UTF-16LE, ...).

Or, with Ruby 1.9, you can iterate over the string by characters (not
bytes) and use the .ord method to get each character's Unicode code point:

  mystr.each_char do |ch|
    puts ch.ord
  end

Jan
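
To get from the each_char/.ord idea to the tally asked about, something like the following might work. It is only a sketch: it assumes the string is UTF-8 and Ruby is 1.9 or later, and the names JAPANESE_RANGES and japanese_tally are made up for illustration. The hiragana range is the one from the question; the katakana (30A0-30FF) and kanji (4E00-9FFF, the basic CJK Unified Ideographs block) ranges are my assumption about what should count, and the kanji block is shared with Chinese. For reference, the bytes \xE3\x81\x95 from the question are the UTF-8 encoding of U+3055, which falls inside the hiragana range.

```ruby
# Code point ranges per script. Only the hiragana range comes from the
# original question; the other two are assumptions (basic Unicode blocks).
JAPANESE_RANGES = {
  :hiragana => 0x3040..0x309F,
  :katakana => 0x30A0..0x30FF,
  :kanji    => 0x4E00..0x9FFF  # CJK Unified Ideographs, shared with Chinese
}.freeze

# Hypothetical helper: returns a hash of counts per script for a UTF-8 string.
def japanese_tally(str)
  tally = Hash.new(0)
  str.each_char do |ch|
    code = ch.ord  # Unicode code point of this character
    JAPANESE_RANGES.each do |name, range|
      if range.cover?(code)
        tally[name] += 1
        break  # ranges do not overlap, so stop at the first match
      end
    end
  end
  tally
end

# Escapes are used so the example does not depend on the source file encoding:
# three hiragana, three katakana, one kanji, plus spaces and ASCII letters.
text = "\u3055\u304F\u3089 \u30B5\u30AF\u30E9 \u685C abc"
counts = japanese_tally(text)
puts counts.inspect
puts counts.values.inject(0, :+)  # total number of Japanese characters
```

On Ruby 1.8, where each_char and .ord are not available in this form, str.unpack("U*") should give the same code points for a UTF-8 string, and the range check stays the same.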