"Brian Candler" <B.Candler / pobox.com> wrote:


> On Thu, Nov 27, 2008 at 03:03:49AM +0900, Radosław Bułat wrote:
>> What about:
>> data.force_encoding("ASCII-8BIT")[1,3].bytes.to_a
>> ?
>
> But that changes the encoding of 'data' as a side-effect. To prevent that,
> you'd need
>
>  data.dup.force_encoding("ASCII-8BIT")[1,3].bytes.to_a
>
> which is getting a bit messy.

In retrospect it might have been nice to have String#force_encoding! do
what force_encoding does now, plus a non-mutating String#force_encoding
that returns a retagged copy, but I think it's far too late for that now.
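
To make the difference concrete, here is a minimal sketch. The helper
name forced_copy is hypothetical (it is not part of Ruby); it only
illustrates what a "duplicating" variant could look like, built on the
dup idiom quoted above.

```ruby
# Hypothetical helper, not part of Ruby: a non-mutating force_encoding
# that retags a copy and leaves the receiver alone.
def forced_copy(str, encoding)
  str.dup.force_encoding(encoding)
end

# "\u00e9" forces the literal to UTF-8; dup keeps it mutable.
data = "h\u00e9llo".dup            # bytes: 68 C3 A9 6C 6C 6F

# Working on a copy: data keeps its UTF-8 tag.
copy_bytes = forced_copy(data, "ASCII-8BIT")[1, 3].bytes.to_a
encoding_after_copy = data.encoding

# Calling force_encoding directly retags data itself as a side effect.
direct_bytes = data.force_encoding("ASCII-8BIT")[1, 3].bytes.to_a
encoding_after_force = data.encoding

puts copy_bytes.inspect
puts encoding_after_copy
puts encoding_after_force
```

Both calls return the same three bytes; only the second one leaves data
tagged as ASCII-8BIT afterwards.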

> OTOH, I'm not sure how often you'd want to
> handle a string which has been tagged as UTF-8 in this way.

I think one of the problems here is that string literals containing \x are
not always tagged ASCII-8BIT; they can get the source encoding instead,
which may very well be UTF-8. This is an issue I have been trying to
highlight for a while.
I would much prefer string literals with "\x" to always be ASCII-8BIT.
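
A small sketch of what I mean. What the first puts prints depends on the
source encoding of the file it runs in, so I am deliberately not stating
an expected value for it; the explicit retagging on a copy is the only
part that behaves the same everywhere.

```ruby
# A literal written with a \x escape. Its encoding tag depends on the
# source encoding of this file, so it is not asserted here.
literal = "\xE9"

# Explicitly retag a copy as binary so the literal itself is untouched.
binary = literal.dup.force_encoding("ASCII-8BIT")

puts literal.encoding          # varies with the source encoding
puts literal.valid_encoding?   # a lone 0xE9 byte is not valid UTF-8
puts binary.encoding
puts binary.bytes.to_a.inspect
```

The byte content is identical in both strings; only the tag differs,
which is exactly why the default matters for binary data.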

Cheers
Mike