On 08/04/2008, Thomas Kellerer <YQDHXVLMUBXG / spammotel.com> wrote:
> Bill Kelly, 08.04.2008 14:51:
>
> > Anyway, ruby 1.8 *does* have usable UTF-8 support "out of the box."  (See
> > also post #9 in that thread by Matz talking about 1.8.)
> >
>
>  Hmm. After reading the thread, I simply tried:
>
>  puts "öäü".length
>
>  and it returns 6 if the source file is saved with UTF8, which is plain
> wrong. (it returns 3 if saved in ISO-8859-1 encoding).
>  String#size returns the same values.
>
>  In case your mail reader does not display the string correctly - it's:
> &ouml;&auml;&uuml;
>

It's completely correct. In 1.8, String#length returns the number of
bytes, not the number of characters. You can get 3 by using regexps in
UTF-8 mode, or by using a special extension; both are mentioned in many
threads on Unicode.
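
For illustration, here is a minimal sketch of the regexp approach. It
assumes the source file is saved as UTF-8, and utf8_char_count is just a
hypothetical helper name I made up, not something that ships with Ruby:

```ruby
# -*- coding: utf-8 -*-
# Sketch only: compares the byte count and the character count of a
# UTF-8 string. Assumes this file is saved as UTF-8.

# Hypothetical helper (not part of Ruby): counts UTF-8 characters.
def utf8_char_count(str)
  # The /u flag makes the regexp treat the string as UTF-8, so "."
  # matches one whole character; /m lets "." match newlines as well.
  str.scan(/./mu).size
end

s = "öäü"

puts utf8_char_count(s)    # number of characters
puts s.unpack("C*").size   # number of bytes (what length reports in 1.8)
```

For the string above this should print 3 and then 6, since each of the
three characters takes two bytes in UTF-8.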

Thanks

Michal