On 6/28/06, Izidor Jerebic <ij.rubylist / gmail.com> wrote:
>
> On 28.6.2006, at 18:48, Austin Ziegler wrote:
>
> > There is *no possible good argument* for separating ByteArray from
> > String in Ruby. Not with what it would do to the rest of the API, and
> > I don't think that anyone who wants a ByteArray is thinking beyond
> > String issues.
>
> Oh, really? So it is OK for this code to sometimes receive binary
> String and sometimes String with encoding:
> io = SomeIO.open( .... )
> v = io.read( 1000 )
>
> This is the most problematic part of String handling. Because if my
> code expects this 'v' to be binary string, v[0..15] is the first 16
> bytes (maybe a message header or something). If this is encoded
> string (because some setting changed outside of my code), v[0..15]
> will be some random amount of data.
>
> This is the error that happens right now and will happen in the
> future also, if the rules are not clear.

I would think that STD* should use the locale (or equivalent) for the
default encoding. So should popen. And open should use the locale to
determine the encoding of *file names*. This might be different from
the encoding of STD* (e.g. on Windows).

For file IO it might be reasonable to set the default encoding from
the locale as well. However, there is no reason to assume that files
contain text. So to make things clear, IO should be binary by
default for files, network, and anything else (except the pipes
mentioned above).
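
To make the v[0..15] problem concrete, here is a small sketch. It
assumes the behaviour Ruby settled on later (1.9 and up), where every
String carries an encoding and indexing counts characters unless the
string is binary. The temp file and its contents are made up for the
illustration.

```ruby
require "tempfile"

# Hypothetical payload: 20 copies of a two-byte UTF-8 character (40 bytes).
payload = "\u00e9" * 20

file = Tempfile.new("demo")
file.binmode
file.write(payload)
file.close

# Binary read: the string is tagged as binary, so indexing counts bytes.
binary = File.binread(file.path)
# Text read with an explicit encoding: indexing counts characters.
text = File.read(file.path, encoding: "UTF-8")

binary_header = binary[0..15]
text_header   = text[0..15]

p binary_header.bytesize  # => 16
p text_header.bytesize    # => 32

file.unlink
```

The same slice yields 16 bytes in one case and 32 in the other, which
is exactly why the caller needs to know which kind of string it got.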

For short scripts one could change that by assigning some global that
specifies the default encoding. For anything else it is reasonable to
demand that everybody sets the encoding when calling open, and even to
issue a warning when it is missing. If you want to know which encoding
you get, there is no other way.
And it is not adding complexity. Today you do not specify an encoding,
but you also do not get anything that deals with it.
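
As a rough sketch of both options, using the API that later Ruby
versions (1.9 and up) ended up with, not anything that existed when
this was written: the mode string passed to File.open can name the
encoding, and Encoding.default_external plays the role of the global.
The temp file is only there to keep the example self-contained.

```ruby
require "tempfile"

file = Tempfile.new("demo")
file.write("header")
file.close

# Explicit per-call choice: the mode string names the encoding.
as_text  = File.open(file.path, "r:UTF-8") { |io| io.read }
as_bytes = File.open(file.path, "rb") { |io| io.read }

# Script-wide default, the "global" approach for short scripts.
previous = Encoding.default_external
Encoding.default_external = Encoding::ISO_8859_1
as_default = File.open(file.path, "r") { |io| io.read }
Encoding.default_external = previous

p as_text.encoding == Encoding::UTF_8
p as_bytes.encoding == Encoding::BINARY
p as_default.encoding == Encoding::ISO_8859_1

file.unlink
```

With the encoding given at open, the code that slices the data no
longer depends on a setting changed somewhere else.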

Thanks

Michal