------------32fHXckHKr4KOkSKFCDLAF
Content-Type: text/plain; format=flowed; delsp=yes; charset=utf-8
Content-Transfer-Encoding: 8bit

Hi,

I decided to put in a couple of hours to see if I could do a quick patch, and the result is attached.

The patch makes "ASCII-8BIT" behave like a "wild-card" encoding: it is compatible with any other encoding, as long as the string's bytes are valid when forced to that encoding. I added a warning that is displayed (with -w) when this happens.

I know I seem to be in the minority on this issue, but I think there are a lot of benefits:

- Less need to worry about the encoding of output strings from libraries and methods like "Array#pack"
- Less need to worry about how "\xNN" string literals are handled
- In most cases simple, encoding-unaware scripts should work without the need for "force_encoding"
- Better compatibility with 1.8 scripts

As there seems to be a good likelihood that the patch will be rejected, I haven't tested it thoroughly, so I may have missed some things, in particular with regexps. Also, I have not attempted to separate ASCII-8BIT and BINARY so that strings which are forced to BINARY cannot be converted. This may be a good idea.

Examples with the patch applied:

RUBYOPT=-w irb
/usr/local/lib/ruby/1.9.0/irb/context.rb:166: warning: method redefined; discarding old irb_name
irb(main):001:0> su = "abc\u0639"
=> "abcع"
irb(main):002:0> sn = "abc\xD8\xB9"
=> "abc\xD8\xB9"
irb(main):003:0> su + sn
(irb):3: warning: Assuming ASCII-8BIT string is UTF-8
=> "abcعabcع"
irb(main):004:0> sn + su
(irb):4: warning: Assuming ASCII-8BIT string is UTF-8
=> "abcعabcع"
irb(main):005:0> sn == su
=> true
irb(main):006:0> su << sn
(irb):6: warning: Assuming ASCII-8BIT string is UTF-8
=> "abcعabcع"
irb(main):007:0> sn << su
=> "abc\xD8\xB9abc\xD8\xB9abc\xD8\xB9"

I am happy to put more effort into this if I get positive feedback. I think it is important because without something like this, there could be justifiable criticisms of the need for "force_encoding" and of poor backward compatibility with 1.8.
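
For readers without a patched interpreter, the "wild-card" rule described above can be approximated in plain Ruby. This is only a rough sketch under stated assumptions, not the patch itself: the method names wildcard_compatible and wildcard_concat are hypothetical, and the logic is built from the stock Encoding.compatible?, String#force_encoding, String#valid_encoding? and String#b methods (String#b needs Ruby 2.0 or later).

```ruby
# Hypothetical helpers; neither is part of Ruby or of the attached patch.

# Returns the encoding two strings could share under the wild-card rule,
# or nil if they remain incompatible.
def wildcard_compatible(a, b)
  # Stock rule first (same encoding, one side 7-bit only, and so on).
  enc = Encoding.compatible?(a, b)
  return enc if enc

  binary = Encoding::ASCII_8BIT
  # An ASCII-8BIT string is accepted if its bytes are valid in the
  # other string's encoding.
  if a.encoding == binary && a.dup.force_encoding(b.encoding).valid_encoding?
    return b.encoding
  end
  if b.encoding == binary && b.dup.force_encoding(a.encoding).valid_encoding?
    return a.encoding
  end
  nil
end

# Concatenates two strings under the wild-card rule, warning whenever the
# stock rule alone would have refused.
def wildcard_concat(a, b)
  enc = wildcard_compatible(a, b)
  if enc.nil?
    raise Encoding::CompatibilityError,
          "incompatible character encodings: #{a.encoding} and #{b.encoding}"
  end
  warn "Assuming ASCII-8BIT string is #{enc}" if Encoding.compatible?(a, b).nil?
  a.dup.force_encoding(enc) + b.dup.force_encoding(enc)
end

su = "abc\u0639"          # UTF-8 string
sn = "abc\xD8\xB9".b      # same bytes, tagged ASCII-8BIT

joined = wildcard_concat(su, sn)   # warns on stderr
p joined.encoding                  # => #<Encoding:UTF-8>
p wildcard_compatible("\xFF".b, su) # => nil, since \xFF is not valid UTF-8
```

On an unpatched interpreter, "su + sn" itself raises Encoding::CompatibilityError, which is the behaviour the patch is meant to relax.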
Cheers
Mike.

On Fri, 31 Oct 2008 15:42:55 +1100, Nobuyoshi Nakada <nobu / ruby-lang.org> wrote:

> Hi,
>
> At Fri, 31 Oct 2008 07:14:21 +0900,
> Michael Selig wrote in [ruby-core:19646]:
>> Feature #695 was closed & marked done, but unfortunately it does not
>> seem to have been implemented :-(
>
> Martin kindly replied already, so I don't have to add his post
> so much.
>
>> If you agree that this is a good idea, I don't mind trying to produce a
>> patch for it myself. Please let me know.
>
> I don't agree, but feel free to post your patch, of course.
>

------------32fHXckHKr4KOkSKFCDLAF
Content-Disposition: attachment; filename=ascii-8bit.pat
Content-Type: application/octet-stream; name=ascii-8bit.pat
Content-Transfer-Encoding: Base64

SW5kZXg6IGVuY29kaW5nLmMKPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PQotLS0gZW5jb2Rp bmcuYwkocmV2aXNpb24gMjAxMTIpCisrKyBlbmNvZGluZy5jCSh3b3JraW5nIGNv cHkpCkBAIC02MTksNiArNjE5LDE2IEBACiAJcmJfcmFpc2UocmJfZUVuY0NvbXBh dEVycm9yLCAiaW5jb21wYXRpYmxlIGNoYXJhY3RlciBlbmNvZGluZ3M6ICVzIGFu ZCAlcyIsCiAJCSByYl9lbmNfbmFtZShyYl9lbmNfZ2V0KHN0cjEpKSwKIAkJIHJi X2VuY19uYW1lKHJiX2VuY19nZXQoc3RyMikpKTsKKyAgICBlbHNlIGlmIChSVEVT VChydWJ5X3ZlcmJvc2UpKSB7CisJaW50IGlkeDEgPSByYl9lbmNfZ2V0X2luZGV4 KHN0cjEpOworCWludCBpZHgyID0gcmJfZW5jX2dldF9pbmRleChzdHIyKTsKKwor CWlmIChpZHgxICE9IGlkeDIgJiYgKGlkeDEgPT0gRU5DSU5ERVhfQVNDSUkgfHwg aWR4MiA9PSBFTkNJTkRFWF9BU0NJSSkKKwkgICAgJiYgKHJiX2VuY19zdHJfY29k ZXJhbmdlKHN0cjEpICE9IEVOQ19DT0RFUkFOR0VfN0JJVCB8fAorCQlyYl9lbmNf c3RyX2NvZGVyYW5nZShzdHIyKSAhPSBFTkNfQ09ERVJBTkdFXzdCSVQpKQorCSAg ICByYl93YXJuaW5nKCJBc3N1bWluZyBBU0NJSS04QklUIHN0cmluZyBpcyAlcyIs CisJCQlyYl9lbmNfbmFtZShyYl9lbmNfZ2V0KGlkeDIgPT0gRU5DSU5ERVhfQVND SUkgPyBzdHIxIDogc3RyMikpKTsKKyAgICB9CiAgICAgcmV0dXJuIGVuYzsKIH0K IApAQCAtNjgwLDYgKzY5MCwxMyBAQAogCX0KIAlpZiAoY3IxID09IEVOQ19DT0RF UkFOR0VfN0JJVCkKIAkgICAgcmV0dXJuIGVuYzI7CisKKwlpZiAoaWR4MSA9PSBF TkNJTkRFWF9BU0NJSSAmJgorCQlyYl9lbmNfc3RyX3ZhbGlkX2VuY29kaW5nKHN0
cjEsIGVuYzIpID09IFF0cnVlKQorCSAgICByZXR1cm4gZW5jMjsKKwlpZiAoaWR4 MiA9PSBFTkNJTkRFWF9BU0NJSSAmJiBCVUlMVElOX1RZUEUoc3RyMikgPT0gVF9T VFJJTkcgJiYKKwkJcmJfZW5jX3N0cl92YWxpZF9lbmNvZGluZyhzdHIyLCBlbmMx KSA9PSBRdHJ1ZSkKKwkgICAgcmV0dXJuIGVuYzE7CiAgICAgfQogICAgIHJldHVy biAwOwogfQpJbmRleDogc3RyaW5nLmMKPT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PQotLS0g c3RyaW5nLmMJKHJldmlzaW9uIDIwMTEyKQorKysgc3RyaW5nLmMJKHdvcmtpbmcg Y29weSkKQEAgLTMyNSw2ICszMjUsMTMgQEAKICAgICByZXR1cm4gY3I7CiB9CiAK K1ZBTFVFCityYl9lbmNfc3RyX3ZhbGlkX2VuY29kaW5nKFZBTFVFIHN0ciwgcmJf ZW5jb2RpbmcgKmVuYykKK3sKKyAgICBpbnQgY3IgPSBjb2RlcmFuZ2Vfc2NhbihS U1RSSU5HX1BUUihzdHIpLCBSU1RSSU5HX0xFTihzdHIpLCBlbmMpOworICAgIHJl dHVybiBjciA9PSBFTkNfQ09ERVJBTkdFX0JST0tFTiA/IFFmYWxzZSA6IFF0cnVl OworfQorCiBpbnQKIHJiX2VuY19zdHJfYXNjaWlvbmx5X3AoVkFMVUUgc3RyKQog ewpAQCAtMTcxNSwxMCArMTcyMiwxNyBAQAogICAgIGlmIChzdHJfZW5jaW5kZXgg IT0gcHRyX2VuY2luZGV4ICYmCiAgICAgICAgIHN0cl9jciAhPSBFTkNfQ09ERVJB TkdFXzdCSVQgJiYKICAgICAgICAgcHRyX2NyICE9IEVOQ19DT0RFUkFOR0VfN0JJ VCkgeworCS8qIFRyZWF0IEFTQ0lJLThCSVQgc3BlY2lhbGx5ICovCisJaWYgKHB0 cl9hOCAmJiBjb2RlcmFuZ2Vfc2NhbihwdHIsIGxlbiwgcmJfZW5jX2Zyb21faW5k ZXgoc3RyX2VuY2luZGV4KSkgIT0gRU5DX0NPREVSQU5HRV9CUk9LRU4pIHsKKwkg ICAgcmJfd2FybmluZygiQXNzdW1pbmcgQVNDSUktOEJJVCBzdHJpbmcgaXMgJXMi LAorCQkJcmJfZW5jX25hbWUocmJfZW5jX2Zyb21faW5kZXgoc3RyX2VuY2luZGV4 KSkpOworCX0KKwllbHNlIGlmICghc3RyX2E4KSB7CiAgICAgICBpbmNvbXBhdGli bGU6Ci0gICAgICAgIHJiX3JhaXNlKHJiX2VFbmNDb21wYXRFcnJvciwgImluY29t cGF0aWJsZSBjaGFyYWN0ZXIgZW5jb2RpbmdzOiAlcyBhbmQgJXMiLAotICAgICAg ICAgICAgcmJfZW5jX25hbWUocmJfZW5jX2Zyb21faW5kZXgoc3RyX2VuY2luZGV4 KSksCi0gICAgICAgICAgICByYl9lbmNfbmFtZShyYl9lbmNfZnJvbV9pbmRleChw dHJfZW5jaW5kZXgpKSk7CisJICAgIHJiX3JhaXNlKHJiX2VFbmNDb21wYXRFcnJv ciwgImluY29tcGF0aWJsZSBjaGFyYWN0ZXIgZW5jb2RpbmdzOiAlcyBhbmQgJXMi LAorCQlyYl9lbmNfbmFtZShyYl9lbmNfZnJvbV9pbmRleChzdHJfZW5jaW5kZXgp KSwKKwkJcmJfZW5jX25hbWUocmJfZW5jX2Zyb21faW5kZXgocHRyX2VuY2luZGV4 
KSkpOworCX0KICAgICB9CiAKICAgICBpZiAoc3RyX2NyID09IEVOQ19DT0RFUkFO R0VfVU5LTk9XTikgewpAQCAtMjA0OCw2ICsyMDYyLDggQEAKICAgICBpZHgxID0g RU5DT0RJTkdfR0VUKHN0cjEpOwogICAgIGlkeDIgPSBFTkNPRElOR19HRVQoc3Ry Mik7CiAgICAgaWYgKGlkeDEgPT0gaWR4MikgcmV0dXJuIFF0cnVlOworICAgIC8q IEFsbG93IGNvbXBhcmlzb25zIGJldHdlZW4gQVNDSUktOEJJVCAmIG90aGVyIGVu Y29kaW5ncyAqLworICAgIGlmIChpZHgxID09IDAgfHwgaWR4MiA9PSAwKSByZXR1 cm4gUXRydWU7CiAgICAgcmMxID0gcmJfZW5jX3N0cl9jb2RlcmFuZ2Uoc3RyMSk7 CiAgICAgcmMyID0gcmJfZW5jX3N0cl9jb2RlcmFuZ2Uoc3RyMik7CiAgICAgaWYg KHJjMSA9PSBFTkNfQ09ERVJBTkdFXzdCSVQpIHsK ------------32fHXckHKr4KOkSKFCDLAF--