In article <3119E5AB-AEC8-4FEE-B2FA-8C75482E0E9D / sun.com>,
  Tim Bray <Tim.Bray / Sun.COM> writes:

> Yes, there are lots of others.  For example, a full-text indexing
> system dealing with a word like Québec, which needs to index it the
> same whether the é appears as one codepoint or two.

"é" is a character, even if it is represented as two
codepoints.

So Ruby should treat it as a character.

I know current Ruby doesn't do that, but it is desirable.
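
To make the difference concrete, here is a minimal sketch.  It
assumes a later Ruby (2.5 or newer) that provides
String#grapheme_clusters, not the Ruby discussed in this thread.
The variable names are only for illustration.

```ruby
# Sketch only: assumes Ruby 2.5+ for String#grapheme_clusters.
composed   = "Qu\u00E9bec"   # é as the single codepoint U+00E9
decomposed = "Que\u0301bec"  # e followed by U+0301 COMBINING ACUTE ACCENT

# The codepoint counts differ between the two spellings ...
composed_codepoints   = composed.codepoints.length    # 6
decomposed_codepoints = decomposed.codepoints.length  # 7

# ... but the number of user-perceived characters is the same.
composed_chars   = composed.grapheme_clusters.length    # 6
decomposed_chars = decomposed.grapheme_clusters.length  # 6
```

Counting grapheme clusters gives the same answer for both
spellings, which is the behaviour a non-expert would expect.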

NFC (Normalization Form C) can be a solution for "é".  But
there are characters that have no single-codepoint form (some
characters defined in JIS X 0213, for example).
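
A rough sketch of both cases follows.  It assumes a later Ruby
(2.2 or newer for String#unicode_normalize, 2.5 or newer for
String#grapheme_clusters).  index_key is a hypothetical helper,
not an existing API, and I am assuming that hiragana KA with the
semi-voiced sound mark (U+304B U+309A) is one of the JIS X 0213
characters that Unicode represents only as a sequence.

```ruby
# Sketch only: assumes Ruby 2.2+ (unicode_normalize) and 2.5+ (grapheme_clusters).

# Hypothetical helper: normalize a word to NFC before indexing it.
def index_key(word)
  word.unicode_normalize(:nfc)
end

# Both spellings of Québec produce the same index key.
same_key = index_key("Qu\u00E9bec") == index_key("Que\u0301bec")

# NFC composes e + U+0301 into the single codepoint U+00E9.
e_acute_codepoints = "e\u0301".unicode_normalize(:nfc).codepoints

# Assumed JIS X 0213 example: KA + semi-voiced sound mark has no
# precomposed form, so NFC leaves it as two codepoints.
ka = "\u304B\u309A".unicode_normalize(:nfc)
ka_codepoints = ka.codepoints
ka_chars      = ka.grapheme_clusters.length  # still one character
```

So normalization helps the indexing case, but it cannot
guarantee one codepoint per character.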

I think codepoints are an implementation detail.  Although
they may be useful for Unicode experts, non-experts will be
confused by the difference between characters and codepoints.
I think codepoint access should not be provided by default.
-- 
Tanaka Akira