Hi,

In message "m17n information?"
    on 04/01/10, jason r tibbetts <tibbetts / acm.org> writes:

|I just read Matz's slides from the 2003 IRC
|(http://www.rubyist.net/~matz/slides/rc2003/), and I was happy to see that
|multilingualization ("m17n") is on the
|menu for Ruby 2.0. However, I couldn't find any specifics about it--how it's
|going to work, etc. I ask because I regularly work with multilingual corpora
|(NLP research), and I just had to replace a concise Ruby script with a crummy
|Java tool when I had to work with Chinese (GB2312-encoded) texts. I'd even offer my
|help towards the effort, if it's needed.

You can get the prototype from the ruby_m17n branch in CVS, but it
comes without any documentation.  Its policies are:

  * no implicit code conversion
  * users can define new encoding handlers
  * an encoding must meet two requirements:
    a) the encoding scheme is stateless
    b) the length of a multibyte character can be determined from its
       first byte, which is not true for GB encoding
    The latter requirement will be removed in the 1.9 implementation.
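
As a rough illustration of requirement b), here is a minimal Ruby sketch.
It is not code from the ruby_m17n branch. It uses UTF-8 as the example
encoding, since UTF-8 is stateless and its lead byte alone determines the
character length. The names utf8_char_length and split_chars are
hypothetical.

```ruby
# Hypothetical helper (not from the ruby_m17n prototype): returns the
# byte length of a UTF-8 character given only its first byte, or nil
# if the byte cannot start a character.
def utf8_char_length(first_byte)
  case first_byte
  when 0x00..0x7F then 1   # ASCII
  when 0xC2..0xDF then 2   # lead byte of a 2-byte sequence
  when 0xE0..0xEF then 3   # lead byte of a 3-byte sequence
  when 0xF0..0xF4 then 4   # lead byte of a 4-byte sequence
  else nil                 # continuation byte or invalid lead byte
  end
end

# Walks a byte string and cuts it into characters using only the
# first-byte lookup above. No state is carried between characters.
def split_chars(bytes)
  chars = []
  i = 0
  while i < bytes.bytesize
    len = utf8_char_length(bytes.getbyte(i))
    raise ArgumentError, "invalid lead byte at offset #{i}" if len.nil?
    chars << bytes.byteslice(i, len)
    i += len
  end
  chars
end
```

An encoding for which the first byte does not fix the length would need
to inspect later bytes before it could advance, so a lookup of this
shape would not be enough.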

If you have any comments, questions, or requests, feel free to tell me.

							matz.