On Sep 19, 2008, at 6:40 AM, Dave Thomas wrote:

> I'm no expert in any of this, but I wonder if part of the problem  
> might be that Ruby tries to support all encodings both internally  
> and externally. Might it be easier to support the full set  
> externally, but to have a more limited set internally? For example,  
> you could support UTF-16<any endian> as an external encoding, but  
> transcode to UTF-8 on the way in. You could still support a rich  
> variety of internal encodings, including the Asian ones you need.  
> But you wouldn't have to deal with UTF-16 when implementing  
> Regexp#escape :)  So, keep the current set of encodings, but only  
> allow a reasonable (ASCII-compliant) subset as internal encodings.
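
To make the suggestion concrete, here is a minimal sketch of "transcode on the way in" using Ruby 1.9's String#encode. The sample text and the variable names are made up for illustration, and this is only one way the boundary could look, not a description of how Ruby handles it internally:

```ruby
# Stand-in for data arriving from a file or socket in UTF-16LE
# (an encoding that is not ASCII-compatible).
external = "caf\u00e9 + tea".encode("UTF-16LE")

# Transcode once at the boundary, so everything past this point
# only ever sees an ASCII-compatible internal encoding.
internal = external.encode("UTF-8")

# Library code such as Regexp.escape now only has to handle UTF-8.
escaped = Regexp.escape(internal)

# Transcode back to the external encoding on the way out.
outgoing = internal.encode("UTF-16LE")
```

The point of the sketch is that the UTF-16 handling lives in two encode calls at the edges, while the code in the middle never needs to know the data was ever UTF-16.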

+1  -Tim