On Fri, 31 Oct 2008 21:24:34 +1100, Martin Duerst <duerst / it.aoyama.ac.jp> wrote:

> Well, we could make some simple scripts simpler, but only at the
> expense of making bigger scripts much more brittle.

I am hoping that there is a way of achieving the simplicity without increasing brittleness. Maybe my original suggestion doesn't achieve this 100%, but I was, and still am, hoping that by discussing and thinking about the issues we may find a way to achieve it, or at least a better balance than there is at the moment.

> In my opinion,
> once you use \x string escapes or pack, you have to know about the
> distinction between bytes and characters, and should be able to
> add the necessary force-encoding (or whatever else is needed).

Yes, the programmer would probably have the technical ability to do this, but that isn't really the point. The main points to me are that existing programs may break, and that extra ugly syntax is required. Please remember that if a user upgrades Ruby to 1.9 and finds that existing scripts stop working, he or she may NOT be a programmer and may not easily be able to diagnose and fix it.

> Well, I think there is a problem in pack. It has so many different
> template characters that it's impossible in general to say what
> encoding the result should be. Matz did some followup work on
> your proposal at revision 20057

I made a suggestion (Redmine 640) a while ago for an Array#pack_encoding method which I think would help, but this hasn't been acted upon.

>> Pack is one simple example of a bunch of methods that return strings,
>> but
>> cannot easily determine what encoding to return them in.
>
> I'd guess pack is one of the more complex ones. If you know others,
> please tell us, I think nobody is claiming that all i's are dotted
> and all t's crossed in this area.

I just meant that pack is a well-known, built-in method, so it is a "simple example" to show the issue. I wasn't referring to what it does or how messy it is.
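
To make the pack case concrete, here is a minimal sketch of the kind of thing a script has to do. It assumes the behaviour of a current Ruby release (the 1.9 development snapshots discussed in this thread may have differed in detail). The helper name pack_as is hypothetical, purely for illustration; it is not the Array#pack_encoding proposal and not part of Ruby.

```ruby
# Hypothetical helper (not part of Ruby): pack the values, then label the
# resulting bytes with the encoding the caller knows they are in.
def pack_as(values, template, encoding)
  values.pack(template).force_encoding(encoding)
end

# Two raw bytes that happen to be the UTF-8 form of "é".
bytes = [0xC3, 0xA9].pack("C*")
bytes.encoding              # binary (ASCII-8BIT), because "C" packs raw bytes
bytes == "\u00E9"           # false: same bytes, but different encodings

# The same bytes once the script states what they mean.
text = pack_as([0xC3, 0xA9], "C*", "UTF-8")
text == "\u00E9"            # true

# Some templates do pick an encoding themselves, e.g. "U" gives UTF-8.
[0xE9].pack("U").encoding
```

The point is only that the force_encoding step carries information which pack cannot work out from a byte-oriented template; whether that step should be this visible to the script author is what is being discussed here.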
However, there are several library methods (both in the standard library and in other places, including my own libraries) which return strings but may not know the encoding of their output. These cannot simply assume that the encoding should be "default_internal" or "default_external".

>> This is *forcing* the programmer to use "force_encoding()"
>
> Or whatever else is appropriate.
>
>> where in 1.8 it
>> was not necessary, and in 1.9 it can seem rather annoying.
>
> It can seem annoying until you realize that it's necessary.

It's necessary because currently Ruby 1.9 is making it so. We may have the opportunity to make it better.

Cheers
Mike