"Tobias Peters" <tpeters / uni-oldenburg.de> a ˝─rit dans le message de
news:Pine.LNX.4.44.0307221251240.3017-100000 / localhost.localdomain...
> On Sun, 20 Jul 2003, renoX wrote:
>
> > Hello,
> > I'm a newbie in Ruby and I have a problem with character encoding.
> >
> > I use Ruby to list the files contained in a directory on Windows.
> > Those filenames contain non-ASCII characters (French accents, etc.).
> >
> > I'd like to put the names of those files in an XHTML webpage.
> > I have two possibilities:
> > - I transcode from the 'native Windows encoding' to UTF-8, or
> > - I declare the XHTML file to be in the encoding used by Ruby on
> >   Windows.
> >
> > My preferred solution would be to do the transcoding, but I don't know
> > what encoding is returned by Dir.each...
>
> This is a can of worms, and it goes far beyond the ruby language. To
> handle character encodings correctly, each character string as well as
> each source and each sink of characters needs an :encoding attribute. This
> is planned for ruby 1.9/2.0.

This explains why I didn't find any information on the subject.

> Since you work on windows, you might be able to get the necessary
> information (encoding of filesystem names) from the OS (but don't ask me
> how), since, AFAIK, NTFS stores file names in Unicode (UTF16 or UCS2, I
> think) and Windows API transcodes filenames from / to the default
> character encoding for your application. FAT filesystems store filenames
> in some default encoding, and Windows might know which one and translate
> as needed (I am not sure).

I'll try UTF-16 or UCS-2, and if that doesn't work, I'll look at the source of
Dir.each to see which system call it uses. People in a Windows programming
group should then be able to tell me which format that call returns.
Apparently the library documentation is a "bit high level" (at least on
Windows) about these "details".

> On Linux, the situation is far worse, since in EXT2 / EXT3, filenames are
> just strings of _bytes_ (only 0x0 and 0x2f are disallowed).
> I have no clue how ruby (or any other program) could determine the
> encoding of filenames there.

Well, I think that Ruby (or any other program) cannot know; it's up to
the programmer to know.
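
To make "up to the programmer" concrete, here is a minimal sketch of one common heuristic. It assumes Ruby 1.9 or later, where strings carry an encoding. The helper name filename_to_utf8 and the Windows-1252 fallback are my own assumptions for illustration; the filesystem itself gives no such hint.

```ruby
# Heuristic sketch: interpret raw filename bytes as UTF-8 if they form
# valid UTF-8, otherwise treat them as a legacy single-byte encoding.
# The fallback encoding (Windows-1252) is an assumption, not something
# the filesystem reports.
def filename_to_utf8(raw, fallback = 'Windows-1252')
  candidate = raw.dup.force_encoding('UTF-8')
  return candidate if candidate.valid_encoding?

  # Bytes that are unassigned in the fallback encoding are replaced
  # instead of raising an exception.
  raw.dup.force_encoding(fallback)
     .encode('UTF-8', invalid: :replace, undef: :replace)
end
```

This is only a guess: a name that happens to be valid UTF-8 is accepted as UTF-8 even if it was written in another encoding.
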
The strange thing (for me) is that there is apparently no library in Ruby to
do the transcoding.
Of course it will be nice once this is incorporated into the language,
but I'm surprised that in the meantime there is no library serving the same
purpose.
I'll check the RAA again to see if I missed it the first time.
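
For reference, a rough sketch of the transcoding step as it could look on Ruby 1.9 or later, where String#encode is built in. The helper names (entries_as_utf8, escape_xml, xhtml_list) are hypothetical, and the source encoding is an assumption: Windows-1252 is used here only because it is a typical "native" code page for French Windows installations.

```ruby
# Sketch: transcode filenames from an assumed source encoding to UTF-8
# and emit them as an XHTML list. All helper names are hypothetical.
XML_ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' }.freeze

# Tag each name with the assumed source encoding, then convert to UTF-8.
def entries_as_utf8(names, source_encoding = 'Windows-1252')
  names.map { |name| name.dup.force_encoding(source_encoding).encode('UTF-8') }
end

# Escape the characters that are special in XHTML text and attributes.
def escape_xml(text)
  text.gsub(/[&<>"]/, XML_ESCAPES)
end

# Build a <ul> fragment from names that are already UTF-8.
def xhtml_list(names)
  items = names.map { |name| "  <li>#{escape_xml(name)}</li>" }
  "<ul>\n#{items.join("\n")}\n</ul>"
end
```

A typical call would be xhtml_list(entries_as_utf8(Dir.entries('.'))), with the page itself declared as UTF-8.
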

Thanks a lot for your help.
RenoX

>
>   Tobias
>