From: "Thomas Thomassen" <thomas / thomthom.net>
> James Gray wrote:
>> 
>> That's a good question.  I'm not sure what it does on Windows.
> Any clues what it does on OS X? The scripts will run on Macs as well.

Unlike that other OS, both OS X and Linux have taken an approach
I like to refer to as, NOT MIND-NUMBINGLY STUPID.

In OS X and Linux, one can use the same API calls one has always
used, as they are now UTF-8 savvy.
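
A rough illustration of that point, assuming a filesystem and locale that
pass UTF-8 bytes through untouched (typical on Linux and OS X): the ordinary
File calls take a UTF-8 encoded name as-is. The helper name and the file name
below are made up for the example.

```ruby
require 'tmpdir'

# Hypothetical helper: create, read back and delete a file whose
# name contains a non-ASCII character, using only the plain File API.
def roundtrip_utf8_filename(dir)
  name = "caf\xC3\xA9.txt"   # "cafe" with an e-acute, as raw UTF-8 bytes
  path = File.join(dir, name)
  File.open(path, 'w') { |f| f.write('hello') }
  content = File.read(path)
  File.delete(path)
  content
end

puts roundtrip_utf8_filename(Dir.tmpdir)
```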


>>> Windows' NTFS format supports UTF-16 encoding - would it work if I
>>> transcoded the strings from UTF-8 to UTF-16?
>> 
>> I think it depends on which API methods you call, so I'm guessing you
>> cannot do this.  I think Ruby would need to be changed to use those
>> methods first.
> Since NTFS supports UTF, then I guess it's the Ruby API that calls the 
> wrong WinAPIs?
> Can I make my own API calls?

In Ruby 1.8 embedded into our C++ application, I've created hooks
so that I can call our Unicode-savvy C++ routines from Ruby.

I suppose it may be possible to do this without involving a
Ruby C extension, assuming the Ruby Win32API module can
be made to call routines like _wopen and such.  I haven't tried that.
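
For what it's worth, here is an untested sketch of how that might look.
The first method is plain Ruby and turns a UTF-8 string into the
null-terminated UTF-16LE form the wide-character routines expect. The
second is an assumption on my part: it uses GetFileAttributesW from
kernel32 rather than _wopen, only because its signature is simpler to
describe to Win32API. Both method names are made up for the example.

```ruby
# Convert a UTF-8 string to null-terminated UTF-16LE.
def utf8_to_utf16le(str)
  units = []
  str.unpack('U*').each do |cp|
    if cp > 0xFFFF
      # Outside the BMP: split into a surrogate pair.
      cp -= 0x10000
      units << (0xD800 | (cp >> 10))
      units << (0xDC00 | (cp & 0x3FF))
    else
      units << cp
    end
  end
  units << 0                 # terminating NUL
  units.pack('v*')           # 16-bit little-endian code units
end

# Assumption, not verified: ask Windows whether a path exists by way of
# the wide-character API. Only meaningful on Windows, so Win32API is
# required inside the method and nothing here runs at load time.
def wide_path_exists?(utf8_path)
  require 'Win32API'
  get_attrs = Win32API.new('kernel32', 'GetFileAttributesW', ['P'], 'L')
  attrs = get_attrs.call(utf8_to_utf16le(utf8_path))
  (attrs & 0xFFFFFFFF) != 0xFFFFFFFF   # INVALID_FILE_ATTRIBUTES
end
```

The conversion deliberately avoids Iconv so it has no dependency beyond
String#unpack and Array#pack.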


> The scripts I write are plugins for Google Sketchup - so the Ruby version 
> I have at my disposal is the one Sketchup bundles - a partial 1.8 version.
> While I've been searching for solutions I've noticed that v1.9 has 
> better support for various encodings, but unfortunately it's of no use 
> to me.
> 
> So my problem is that I have to deal with string data that comes from 
> Sketchup in UTF-8 format - I might even have to deal with files and folders 
> that include characters outside the Windows-1252 or ISO-8859 range 
> (whatever the IO functions are using - I've not been able to pinpoint 
> this). If I get characters outside that range, it's impossible to 
> transcode.
> And I also don't know what would happen for an Eastern user. I'm 
> wondering if the IO functions would assume a different 8-bit encoding...

For best 8-bit compatibility you'll want to encode to Windows-1252.

But this (of course) won't help at all with Chinese characters, etc.
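
A minimal pure-Ruby sketch of that transcoding, in case Iconv isn't
available in the bundled interpreter. The method and constant names are
made up for the example. It returns nil when a character has no
Windows-1252 equivalent, which is exactly what happens with Chinese text.

```ruby
# Unicode code points that Windows-1252 places in the 0x80-0x9F range.
CP1252_EXTRA = {
  0x20AC => 0x80, 0x201A => 0x82, 0x0192 => 0x83, 0x201E => 0x84,
  0x2026 => 0x85, 0x2020 => 0x86, 0x2021 => 0x87, 0x02C6 => 0x88,
  0x2030 => 0x89, 0x0160 => 0x8A, 0x2039 => 0x8B, 0x0152 => 0x8C,
  0x017D => 0x8E, 0x2018 => 0x91, 0x2019 => 0x92, 0x201C => 0x93,
  0x201D => 0x94, 0x2022 => 0x95, 0x2013 => 0x96, 0x2014 => 0x97,
  0x02DC => 0x98, 0x2122 => 0x99, 0x0161 => 0x9A, 0x203A => 0x9B,
  0x0153 => 0x9C, 0x017E => 0x9E, 0x0178 => 0x9F
}

# Transcode UTF-8 to Windows-1252, or return nil if any character
# cannot be represented in that code page.
def utf8_to_cp1252(str)
  bytes = str.unpack('U*').map do |cp|
    if cp < 0x80 || (cp >= 0xA0 && cp <= 0xFF)
      cp                     # same value as in Latin-1
    elsif CP1252_EXTRA.key?(cp)
      CP1252_EXTRA[cp]
    else
      return nil             # no 8-bit equivalent
    end
  end
  bytes.pack('C*')
end
```

Checking for nil before touching the filesystem at least lets a plugin
refuse cleanly instead of writing to a mangled path.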


Regards,

Bill