Hello ruby-core,

Today I was playing around with manipulating strings containing
binary data, e.g. "\xaa\xab\xac\xad\xae", using the new String
methods available in Ruby 1.9.

The exercise I was trying out was to extract a range of bytes as
Fixnums, much like String#bytes, except that I was only interested
in a subarray. Like so:

"\xaa\xab\xac\xad\xae".bytes.to_a[1,3] # => [171, 172, 173]

This works but operates on the entire data set, which I imagine is
fairly expensive... So I chopped it up beforehand like so:

"\xaa\xab\xac\xad\xae"[1,3].bytes.to_a
=> [171, 172, 173]

Much better. This works fine when using the ASCII-8BIT encoding.
As soon as you use something like UTF-8 it fails because the []
method now works on characters instead of bytes.

data = "\xc3\xa9\xc3\xa9" # => "\xC3\xA9\xC3\xA9"
data.force_encoding("utf-8") # => "éé"
data[0,2].bytes.to_a  # => [195, 169, 195, 169]

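Here I get four bytes instead of the first two that I wanted, because
data[0,2] now selects two characters and each "é" is two bytes in
UTF-8. Slicing the full byte array gives the expected result: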

data.bytes.to_a[0,2] # => [195, 169]

So, having said that, I must ensure that the binary data I am working
with is encoded using ASCII-8BIT. This is of course completely
reasonable and recommended.
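
For data whose encoding I don't control, a minimal sketch of a
workaround could look like the following. byte_range is a hypothetical
helper name of my own, not a Ruby method; it re-tags a copy of the
string as ASCII-8BIT so that [] indexes bytes again:

```ruby
# Hypothetical helper (not part of Ruby): returns `len` bytes starting at
# byte offset `start` as integers, whatever the string's encoding is.
def byte_range(str, start, len)
  # Work on a copy tagged as binary so that [] counts bytes, not characters.
  binary = str.dup.force_encoding("ASCII-8BIT")
  slice = binary[start, len]
  # String#[] returns nil when the offset is out of range; pass that on.
  slice ? slice.bytes.to_a : nil
end

data = "\xc3\xa9\xc3\xa9".dup.force_encoding("UTF-8")
byte_range(data, 0, 2) # => [195, 169]
data.encoding.name     # => "UTF-8"
```

The dup leaves the caller's string and its encoding untouched.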

Anyway, my real comment is that it might be nice to have String#getbyte
or String#bytes be able to get a subset of bytes from the string.
For example:

"abcde".bytes(1,3).to_a # => [98, 99, 100]
"abcde".bytes(1..3).to_a # => [98, 99, 100]
"abcde".bytes(-2,2).to_a # => [100, 101]

"abcde".getbyte(1,3) # => [98, 99, 100]
"abcde".getbyte(1..3) # => [98, 99, 100]
"abcde".getbyte(-2,2) # => [100, 101]
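
To make the intended semantics concrete, here is a rough sketch written
as a plain helper on top of the existing single-index String#getbyte.
The name bytes_at is hypothetical, and the argument handling is only my
assumption of how such a method might mirror String#[]; it handles
finite ranges only:

```ruby
# Hypothetical helper (not an existing String method): returns the bytes
# selected by (start, len) or by a finite Range as an array of integers.
def bytes_at(str, start, len = nil)
  size = str.bytesize
  if start.is_a?(Range)
    first, last = start.begin, start.end
    first += size if first < 0          # negative indices count from the end
    last  += size if last < 0
    last  -= 1 if start.exclude_end?
    start = first
    len   = [last - first + 1, 0].max   # an inverted range selects nothing
  else
    raise ArgumentError, "length required" if len.nil?
    start += size if start < 0
  end
  return nil if start < 0 || start > size || len < 0
  stop = [start + len, size].min        # clamp to the end of the string
  # Only the requested bytes are read; the rest of the string is not touched.
  (start...stop).map { |i| str.getbyte(i) }
end

bytes_at("abcde", 1, 3)  # => [98, 99, 100]
bytes_at("abcde", 1..3)  # => [98, 99, 100]
bytes_at("abcde", -2, 2) # => [100, 101]

utf8 = "\xc3\xa9\xc3\xa9".dup.force_encoding("UTF-8")
bytes_at(utf8, 0, 2)     # => [195, 169]
```

Since getbyte always indexes bytes, the result does not depend on the
string's encoding.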

String#getbyte is singular rather than plural, though, so returning
several bytes from it doesn't sit well with me.

Thanks for reading!

 - Emiel van de Laar