> A general hint for debugging encoding troubles: the UTF-8 encoding
> *guarantees* that every Unicode codepoint is *either* encoded into a
> *single* octet with its most significant bit cleared to 0 (i.e. a
> decimal value between 0 and 127) *or* into a *sequence* of 2 to 6
> octets, *all* of which have their MSB set to 1 (i.e. a decimal value
> between 128 and 255).

Question: Is the length of that 2-to-6-octet sequence always even, i.e. 
2, 4, or 6 octets, but never 3 or 5?
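
One way to check this empirically is to encode a few codepoints and count the octets. The following is only a minimal sketch, assuming Ruby 1.9 or later; the helper names utf8_octets and describe and the sample codepoints are made up for illustration. It relies on Array#pack with the "U" directive, which produces the UTF-8 encoding of a codepoint. Note that the current UTF-8 definition (RFC 3629) stops at 4 octets, so the 5- and 6-octet forms from the older definition will not show up here.

```ruby
# Hypothetical helper: returns the UTF-8 octets of one codepoint
# as an array of integers.
def utf8_octets(codepoint)
  [codepoint].pack("U").bytes.to_a
end

# Hypothetical helper: formats the codepoint, the octet count,
# and the octets in hex.
def describe(codepoint)
  octets = utf8_octets(codepoint)
  hex = octets.map { |b| format("%02X", b) }.join(" ")
  format("U+%04X -> %d octet(s): %s", codepoint, octets.length, hex)
end

# Sample codepoints picked for illustration:
# "A", "e" with acute accent, the euro sign, and an emoji.
[0x41, 0xE9, 0x20AC, 0x1F600].each do |cp|
  puts describe(cp)
end
# U+0041 -> 1 octet(s): 41
# U+00E9 -> 2 octet(s): C3 A9
# U+20AC -> 3 octet(s): E2 82 AC
# U+1F600 -> 4 octet(s): F0 9F 98 80
```

The euro sign comes out as 3 octets, so odd lengths do occur. In every multi-octet case above, each octet is 0x80 or higher, which matches the MSB property in the quoted text.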

-- 
Posted via http://www.ruby-forum.com/.