On Sep 8, 2008, at 9:06 PM, Urabe Shyouhei wrote:

> Yui's point was that your unicode definition of "character" might differ
> from what programmers want to have.  Unicode's concept of codepoints are
> not universal among those other encodings such as Shift_JIS.

Correct.

>  Ruby did
> not take unicode-centric architecture so something specific to unicode
> might not always be adopted.  What you should show was how iterations
> over codepoints are useful (among other encodings); hence his question
> "When you use each_code?". Manfred's use case is one of those.

Yes, there are lots of others.  For example, a full-text indexing
system dealing with a word like Québec, which needs to index it the
same whether the é appears as one codepoint or two.
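
To make the Québec case concrete, here is a minimal sketch.  It assumes
a present-day Ruby (2.2 or later) where String#unicode_normalize is
available; index_key is a hypothetical helper name, not anything in
Ruby or in a real indexing library.

```ruby
# Hypothetical helper: the key a full-text index might store for a word.
# NFC normalization composes "e" + U+0301 into the single codepoint U+00E9.
def index_key(word)
  word.unicode_normalize(:nfc).downcase
end

composed   = "Qu\u00E9bec"   # é as one codepoint (U+00E9)
decomposed = "Que\u0301bec"  # e followed by U+0301 COMBINING ACUTE ACCENT

# Iterating over codepoints is how you see the difference at all:
composed.each_codepoint.to_a.length    # 6
decomposed.each_codepoint.to_a.length  # 7

# After normalization both spellings produce the same index key.
index_key(composed) == index_key(decomposed)  # true
```

The two strings look identical on screen but compare unequal until
they are normalized, which is why the indexer has to work at the
codepoint level.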

Actually, for many programmers working in Unicode, what they need
isn't String#each_codepoint but IO#each_codepoint, because with
variable-length encodings it would be very nice if the library took
care of the necessary buffer juggling.
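
For anyone wondering what that buffer juggling looks like when done by
hand, here is a rough sketch, assuming UTF-8 input only.  The names
each_codepoint_from and complete_prefix_length are made up for this
example.  The point is that a fixed-size read can stop in the middle
of a multi-byte sequence, so the leftover bytes have to be carried
into the next chunk.

```ruby
require "stringio"

# Hypothetical helper: number of leading bytes in +bytes+ that end on a
# UTF-8 sequence boundary. Only boundaries are checked, not validity.
def complete_prefix_length(bytes)
  size = bytes.bytesize
  # A sequence is at most 4 bytes, so an incomplete one has at most 3.
  1.upto([3, size].min) do |back|
    byte = bytes.getbyte(size - back)
    next if (byte & 0xC0) == 0x80   # continuation byte, keep walking back
    needed = if byte >= 0xF0 then 4
             elsif byte >= 0xE0 then 3
             elsif byte >= 0xC0 then 2
             else 1
             end
    # If the lead byte promises more bytes than we have, cut before it.
    return needed > back ? size - back : size
  end
  size
end

# Hypothetical helper: yields each codepoint of a UTF-8 stream that is
# read in fixed-size byte chunks.
def each_codepoint_from(io, chunk_size = 4096)
  return enum_for(:each_codepoint_from, io, chunk_size) unless block_given?
  pending = String.new              # binary buffer for carried-over bytes
  while (chunk = io.read(chunk_size))
    pending << chunk
    cut   = complete_prefix_length(pending)
    ready = pending.byteslice(0, cut).force_encoding(Encoding::UTF_8)
    pending = pending.byteslice(cut, pending.bytesize - cut)
    ready.each_codepoint { |cp| yield cp }   # raises on invalid UTF-8
  end
  raise ArgumentError, "truncated UTF-8 sequence" unless pending.empty?
end

# A 3-byte chunk size splits the two-byte é across two reads.
io = StringIO.new("Qu\u00E9bec")
each_codepoint_from(io, 3).to_a   # [81, 117, 233, 98, 101, 99]
```

Every program that reads variable-length text in chunks ends up
writing some version of this, which is the argument for having the
library do it once.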

  -T