On Monday 08 February 2010 03:05:18 pm Seebs wrote:
> On 2010-02-08, Marnen Laibow-Koser <marnen / marnen.org> wrote:
> > I haven't used 1.9 yet, so take this with a grain of salt, but my
> > impression is that encoding-aware Strings that aren't byte arrays is
> > exactly the right thing for Ruby to have.
> 
> It is certainly a useful thing to have, but I'm not sure that it's a good
> idea to do away with byte arrays.

They're still around; they're just slightly ugly. You have to tag the string with 
a weird pseudo-encoding, ASCII-8BIT (also aliased as BINARY), to mark it as raw bytes.
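
For anybody who hasn't played with 1.9 yet, it looks roughly like this (the 
filename is just for illustration):

  # Reading in binary mode hands you a "raw" string tagged ASCII-8BIT.
  data = File.open("chunk.bin", "rb") { |f| f.read }
  data.encoding                    # => #<Encoding:ASCII-8BIT>

  # An existing string can be re-tagged in place, no copying involved:
  s = "whatever came off the wire"
  s.force_encoding("ASCII-8BIT")
  s.encoding                       # => #<Encoding:ASCII-8BIT>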

> The array type seems INCREDIBLY expensive for this -- do I really want
>  to allocate over two thousand objects to read in a 2KB chunk of data?

If they're bytes, sure. I don't know enough about the Ruby internals to know 
if it's worse than a string -- it probably is, since arrays can hold arbitrary 
values -- but if it's an array of integers, remember that while integers 
behave like objects, they aren't actually allocated like objects. Ruby appears 
to be using an old Smalltalk trick here, in that a single bit in the object 
reference (itself an integer) signals whether this particular reference is an 
int or an actual reference -- thus, ints lose a bit of range (the tag bit), but 
gain a LOT of speed.

On second thought, that is still expensive -- each array slot holds a full 
machine-word reference, so storing one byte that way costs 4 or 8 bytes -- but 
not _that_ expensive.
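
You can actually see the trick poking through from Ruby land, at least on MRI:

  1.object_id                      # => 3 -- i.e. (1 << 1) | 1, the tag bit is visible
  100.object_id                    # => 201
  1.equal?(1)                      # => true -- always the "same" immediate value

  "a".object_id == "a".object_id   # => false -- two real heap allocations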

But really, it seems to me that the answer here would be to follow Python -- 
add a separate binary type. To be especially idiomatic, we could make strings, 
arrays, and binary data all share a similar duck-type (rough sketch below).
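
Purely as a thumbnail sketch -- the Binary class here is made up, not a 
proposal for actual names:

  class Binary
    def initialize(bytes)
      @bytes = bytes.dup
    end

    # answer the same byte-level message Strings already answer in 1.9
    def each_byte(&block)
      @bytes.each(&block)
    end
  end

  def checksum(buf)
    sum = 0
    buf.each_byte { |b| sum = (sum + b) & 0xff }
    sum
  end

  checksum("abc")                           # 1.9 Strings respond to each_byte
  checksum(Binary.new([0x61, 0x62, 0x63]))  # ...and so does the made-up type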

And the short-term answer is to use "raw" strings, because they're already 
used everywhere and they're already efficient.
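
For what it's worth, raw strings already cover most of the byte-banging without 
allocating an object per byte:

  raw = [0xde, 0xad, 0xbe, 0xef].pack("C*")  # Array of ints -> raw string
  raw.getbyte(0)                             # => 222 -- reads one byte, nothing else allocated
  raw.unpack("C*")                           # => [222, 173, 190, 239] when you do want the Array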