On Monday 08 February 2010 03:05:18 pm Seebs wrote:
> On 2010-02-08, Marnen Laibow-Koser <marnen / marnen.org> wrote:
> > I haven't used 1.9 yet, so take this with a grain of salt, but my
> > impression is that encoding-aware Strings that aren't byte arrays is
> > exactly the right thing for Ruby to have.
>
> It is certainly a useful thing to have, but I'm not sure that it's a good
> idea to do away with byte arrays.

They're still around, they're just slightly ugly. You have to specify a weird encoding, something like ASCII-8BIT, to mark the string as raw.

> The array type seems INCREDIBLY expensive for this -- do I really want
> to allocate over two thousand objects to read in a 2KB chunk of data?

If they're bytes, sure. I don't know enough about the Ruby internals to know if it's worse than a string -- it probably is, since arrays can hold arbitrary values -- but if it's an array of integers, remember that while integers behave like objects, they aren't actually allocated like objects. Ruby appears to be using an old Smalltalk trick here, in that a single bit in the object reference (itself an integer) signals whether this particular reference is an int or an actual reference -- thus, ints lose some precision, but gain a LOT of speed.

On second thought, that is expensive -- an int is probably bigger than a byte -- but not _that_ expensive.

But really, it seems to me that the answer here would be to follow Python -- add a separate binary type. To be especially idiomatic, we could make strings, arrays, and binary data all have a similar duck-type.

And the short-term answer is to use "raw" strings, because they're already used everywhere and they're already efficient.
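For what it's worth, here is a rough sketch of the "raw" string approach next to the array-of-integers form. It assumes Ruby 1.9 or later, and the variable names and sample data are made up for illustration. The 2KB chunk is built in memory with Array#pack instead of being read from a file.

```ruby
# A 2 KB chunk of arbitrary bytes, built here rather than read from a file.
raw = ([0x00, 0x7f, 0x80, 0xff] * 512).pack("C*")

raw.encoding      # pack("C*") yields a binary (ASCII-8BIT) string
raw.bytesize      # 2048 bytes, all held in a single String object

# A string that arrived with some other encoding can be marked as raw.
# force_encoding only changes the tag; the underlying bytes are untouched.
text  = "caf\u00e9"                              # UTF-8: 4 characters, 5 bytes
bytes = text.dup.force_encoding(Encoding::BINARY) # now 5 one-byte "characters"

# Byte-level access without building an Array.
first_byte = raw.getbyte(2)                       # the Integer 0x80

# The array-of-integers form, for comparison: one slot per byte.
as_array = raw.unpack("C*")
```

The string form keeps the whole chunk in one object. The unpacked array holds one integer per byte, which is the more expensive layout discussed above, even though small integers are not separately allocated.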