On Fri, 31 Oct 2008 21:24:34 +1100, Martin Duerst <duerst / it.aoyama.ac.jp>  
wrote:

> Well, we could make some simple scripts simpler, but only at the
> expense of making bigger scripts much more brittle.

I am hoping that there is a way of achieving the simplicity without  
increasing brittleness.
Maybe my original suggestion doesn't achieve this 100%, but I am still
hoping that, by discussing and thinking about the issues, we may find a
way to achieve it, or at least a better balance than there is at the
moment.

> In my opinion,
> once you use \x string escapes or pack, you have to know about the
> distinction between bytes and characters, and should be able to
> add the necessary force-encoding (or whatever else is needed).

Yes, the programmer would probably have the technical ability to do this,  
but that isn't really the point.
The main points to me are that existing programs may break, and that
extra, ugly syntax is required.
Please remember that if a user upgrades Ruby to 1.9 and finds that existing
scripts stop working, he or she may NOT be a programmer and may not easily
be able to diagnose and fix it.
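
To make the breakage concrete, here is a minimal sketch of the kind of
script I have in mind. The method names (packed_bytes, naive_join,
explicit_join) are made up for illustration only, and I am assuming a 1.9
where pack("C*") returns a string tagged as binary (ASCII-8BIT):

```ruby
# Hypothetical helper: pack("C*") returns a string tagged as binary.
def packed_bytes
  [0xC3, 0xA9].pack("C*")   # the two UTF-8 bytes of "e with acute accent"
end

# 1.8-style code: simply concatenate the two strings.
def naive_join(text)
  text + packed_bytes
end

# 1.9-style code: the packed bytes must be re-tagged explicitly.
def explicit_join(text)
  text + packed_bytes.force_encoding(text.encoding)
end

text = "caf\u00e9"   # a UTF-8 string containing a non-ASCII character

begin
  naive_join(text)
rescue Encoding::CompatibilityError => e
  puts "naive_join failed: #{e.class}"
end

puts explicit_join(text).encoding
```

The 1.8 version of this script needed no knowledge of encodings at all.
In 1.9 the naive version fails only when both strings contain non-ASCII
bytes, so the problem may not show up until the script meets real data.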


> Well, I think there is a problem in pack. It has so many different
> template characters that it's impossible in general to say what
> encoding the result should be. Matz did some followup work on
> your proposal at revision 20057

I made a suggestion (redmine 640) a while ago for an Array#pack_encoding  
method which I think would help, but this hasn't been acted upon.

>> Pack is one simple example of a bunch of methods that return strings,
>> but cannot easily determine what encoding to return them in.
>
> I'd guess pack is one of the more complex ones. If you know others,
> please tell us, I think nobody is claiming that all i's are dotted
> and all t's crossed in this area.

I just meant that pack is a well-known, built-in method, so it is a "simple
example" to show the issue. I wasn't referring to what it does or how messy
it is. However, there are several library methods (both in the standard
library and in other places, including my own libraries) which return
strings but may not know the encoding of their output. These methods
cannot simply assume that the encoding should be "default_internal" or
"default_external".
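
As a rough sketch of what such a library method looks like (read_field
and its parameters are hypothetical, not taken from any real library), the
method only sees bytes, so the best it can do is return binary unless the
caller tells it otherwise:

```ruby
# Hypothetical library method: extracts a field from a raw byte record.
# The library cannot know what the bytes mean, so it defaults to binary
# and lets the caller supply the encoding when the caller knows it.
def read_field(raw, offset, length, encoding = Encoding::BINARY)
  raw.byteslice(offset, length).force_encoding(encoding)
end

# One length byte followed by the two UTF-8 bytes of "e with acute accent".
record = [2, 0xC3, 0xA9].pack("C*")

untagged = read_field(record, 1, 2)                   # still binary
tagged   = read_field(record, 1, 2, Encoding::UTF_8)  # caller knows better

puts untagged.encoding
puts tagged.encoding
```

Either way, somebody has to state the encoding by hand; the question is
only whether that burden falls on the library or on every caller.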

>> This is *forcing* the programmer to use "force_encoding()"
>
> Or whatever else is appropriate.
>
>> where in 1.8 it
>> was not necessary, and in 1.9 it can seem rather annoying.
>
> It can seem annoying until you realize that it's necessary.

It's necessary because currently Ruby 1.9 is making it so. We may have the  
opportunity to make it better.

Cheers
Mike