> internationalization = i + 18 letters + n = i18n.
>
> So, m17n obviously means Matsumototification.

Thanks! :-))))))))))))))))) Why didn't I count?

That's called the Steigen-effect! No one tells you what everybody already
knows. (from a little place in the north of Norway, where the roads have no
names, and only newcomers can't find their way.)

Is there a ref doc on m17n? A brief description?

> I wonder who will be the first to derive a Utf8String class based on this.

Desired behaviour? ... Or do you just mean that indexing gives
letter-wise, and not byte-wise, results?
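
To make the letter-wise versus byte-wise question concrete, here is a minimal sketch in present-day Python 3 (an assumption on my part; it is not the Python or Ruby discussed in this thread, and it is not the proposed Utf8String class). The sample word is made up, and "letter" here really means Unicode code point, which is not always the same thing as a letter.

```python
# Illustration only: one string seen as code points and as UTF-8 bytes.
word = "\u0109u"                 # Esperanto "ĉu": 2 letters

as_bytes = word.encode("utf-8")  # "ĉ" takes 2 bytes, "u" takes 1

letter_count = len(word)         # counts code points
byte_count = len(as_bytes)       # counts bytes

first_letter = word[0]           # the whole letter "ĉ"
first_byte = as_bytes[0:1]       # only the first byte of "ĉ"

print(letter_count, byte_count)
print(first_byte)
```

Indexing the byte form cuts the first letter in half, which is presumably the behaviour a Utf8String class would be meant to hide.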

> Also, I point out that even though Python allegedly supports Unicode,
> and Tkinter supports Unicode, and Windows 98 supports Unicode, IDLE 0.6
> breaks in inexplicable ways when I try to print non-Latin1 characters
> to the python shell.

OT:-)
You should always print _only_ utf-8 encoded output:
    print unicodestring.encode("utf-8")
Then Tk is somewhat happy, as long as the font selected in the middle of
some IDLE module supports the characters you are interested in. I
recommend e.g. "Arial Unicode MS" on Windows. Linux? The Esperanto letters,
though, are part of every modern Windows Unicode font... "MS Gothic" I think
is Shift-JIS, so it won't even work for Ruby itself: backslash == Yen.

One thing I don't understand. If you support Unicode like current Python
does, what do you see in your string literals when programming (most writing
tools support utf-8)? ... Nothing readable. Then you will have to convert
between 16-bit Unicode and utf-8. All the time. Messing up your code. Then
the regexes know nothing of word boundaries in general Unicode (or did I do
it wrong?). Then, they say that Python (and Perl) represent Unicode strings
as utf-8. And then what?
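
As a rough sketch of the two complaints above (converting between encodings, and regexes that miss word boundaries), here is how it looks in present-day Python 3. That is an assumption: the Python current when this was written may well behave differently. The Norwegian sample text is made up for the illustration.

```python
import re

# Assumed sample text: "blåbær og fløte" (Norwegian), written with escapes.
text = "bl\u00e5b\u00e6r og fl\u00f8te"

# Round trip between two encodings of the same string.
utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")
same_text = utf8.decode("utf-8") == utf16.decode("utf-16-le") == text

# Unicode-aware word matching (the default for str patterns in Python 3).
unicode_words = re.findall(r"\w+", text)

# ASCII-only matching splits the words at every non-ASCII letter.
ascii_words = re.findall(r"\w+", text, flags=re.ASCII)

print(len(utf8), len(utf16), same_text)
print(len(unicode_words), len(ascii_words))
```

With the ASCII-only flag the three words fall apart into six fragments, which is roughly the "knows nothing of word boundaries" problem. The conversion itself is one call each way, but it still has to be written at every boundary between the two forms.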

So "supporting" unicode is more than just adding a string-type.

The current string module in Ruby is not just "inspired by western culture",
but many of the functions support *English only*. Just adding on is
not good enough. As long as we go for *English only* as a model for i18n, we
can freely mix issues of *encoding*, *string* handling and *language
specific* support/methods.

Somehow I think I would prefer a separation of encoding, strings, and
language issues. The merged version now found in every computer language
just makes the i18n theme much less transparent than it actually is.

In Ruby there seems to be a chance to do this the Rubian way. So what is the
Rubian Way, anyway?

The fact that Ruby is widely used for non-English, non-western
language processing might be beneficial. But when will I be able to read all
these Nihon-go docs?

amike via Henning.

BTW I heard that Klingon is to make it into the Unicode Standard:-)