On 26.6.2006, at 21:46, Austin Ziegler wrote:

> On 6/26/06, Izidor Jerebic <ij.rubylist / gmail.com> wrote:
>> On 26.6.2006, at 20:37, Michal Suchanek wrote:
>>> If you consider s3 = File.open('legacy.txt','rb',:iso885915) { |f|
>>> f.read } without autoconversion you would have to immediately do
>>> s3.recode :utf8 otherwise s1 + s3 would not work.
>> Yes. This shows that if there is no autoconversion, programmer will
>> always need to recode to a common app encoding if the aplication is
>> to work without problems. And if we always need to recode strings
>> which we receive from third-part classes/libraries, encoding handling
>> will either consume half of the program lines or people won't do it
>> and programs will be full of errors. As can be seen from experience
>> of other languages (and Ruby), the second option will prevail and we
>> will be in a mess not much better than today.
>
> I doubt this is in the least bit true.
> I'm saying that your cure is far worse than disease.

Basically, I am just advocating getting autoconversion into the
"official" proposal. I am not proposing Unicode. But if there is no
autoconversion, Unicode is better. This claim is meant to win support
for autoconversion :-)

BTW, you may have no problems at all. We, on the other hand, have lots
of problems (in Ruby and other languages) which can be traced to
exactly this hope that "all programmers will be doing lots of manual
work to make things safe for others". You are deluded.

In environments which already have this cure (internal Unicode), there
are no such enormous problems as we experience in those without it. So
the successes and failures I describe are based on real experience,
unlike your claims, which are just opinions.

I am not saying that Unicode encoding is the ideal solution. But it has
turned out to be quite a good one, and for sure much better than manual
checking/changing of encoding.
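To make the manual-recode burden concrete, here is a minimal sketch of Michal's s1 + s3 case. It assumes the String#encode and String#force_encoding methods that arrived in later Ruby versions (1.9 onward) as stand-ins for the `recode` call in the quoted example, which is only a proposal in this thread. The sample bytes are made up for illustration.

```ruby
# Sketch only: String#encode / force_encoding (Ruby 1.9+) stand in for
# the proposed `recode`; the sample data is invented for illustration.
s1 = "na\u00EFve "                                          # UTF-8 text with a non-ASCII character
s3 = "caf\xE9".dup.force_encoding(Encoding::ISO_8859_15)   # bytes as read from a legacy file

begin
  s1 + s3                      # no autoconversion: the encodings are incompatible
  mixed_ok = true
rescue Encoding::CompatibilityError
  mixed_ok = false             # this branch is taken
end

# The manual step that every caller has to remember:
joined = s1 + s3.encode(Encoding::UTF_8)
```

Every string arriving from a third-party library needs the same treatment before it can be combined with application strings, which is the work I expect people to skip.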
>
>> Therefore m17n without autconversion (as is current Matz's proposal)
>> gains us almost nothing. If we have no autoconversion, my vote
>> goes to
>> Unicode internal encoding (because it implicitly handles
>> autoconversion problems).
>
> So does the coersion proposal that I've made without locking ourselves
> into Unicode.

But that is your proposal (and mine and several others'), not Matz's.
The current "official" proposal will make a mess.

>
> I couldn't make sense of your last paragraph.

Well, tell me what exactly I get when this code executes:

    result = File.open( "file" ) { |f| f.read( 1000 ) }

What is 'result'? A binary string under all circumstances? Or do I
sometimes get a String and sometimes a binary String? Which one, under
what circumstances?

This is called error-prone code with undefined results. We have two
equally good options:

1. If we change the API so that IO returns ByteArray, we have no
   confusion.

2. If we have clear and simple rules about IO returning Strings, we
   also have no confusion.

Therefore, if there is going to be complex auto-magic tagging of
Strings with encodings, I prefer introducing ByteArray, because it
will prevent errors.

izidor
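
P.S. A rough sketch of what I mean by option 1. ByteArray here is a hypothetical class, not something Ruby provides, and the StringIO merely stands in for a file. String#b, force_encoding and encode come from later Ruby versions (1.9/2.0 onward), so treat this as an illustration under those assumptions rather than as the proposal itself.

```ruby
require "stringio"

# Hypothetical ByteArray (option 1): NOT an existing Ruby class.
class ByteArray
  def initialize(bytes)
    @bytes = bytes.b                      # String#b: a copy tagged as binary
  end

  def size
    @bytes.bytesize
  end

  # Text can only be obtained by naming the encoding explicitly.
  def to_string(encoding)
    @bytes.dup.force_encoding(encoding)
  end
end

io   = StringIO.new("caf\xE9".b)          # stands in for a legacy ISO-8859-15 file
raw  = ByteArray.new(io.read(1000))       # IO hands back bytes, never a tagged String
text = raw.to_string(Encoding::ISO_8859_15).encode(Encoding::UTF_8)
```

Because ByteArray defines no implicit conversion to String, an accidental "abc" + raw fails immediately instead of producing a string of uncertain encoding.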