------------32fHXckHKr4KOkSKFCDLAF
Content-Type: text/plain; format=flowed; delsp=yes; charset=utf-8
Content-Transfer-Encoding: 8bit

Hi,

I decided to put in a couple of hours to see if I could do a quick patch,  
and the result is attached.

The patch makes "ASCII-8BIT" act as a "wild-card" encoding: an ASCII-8BIT  
string is compatible with a string in any other encoding, as long as its  
bytes are valid when forced to that encoding. I added a warning that is  
displayed (with -w) when this happens.
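
To make the rule concrete, here is a rough sketch of it in plain Ruby. It is  
only an illustration under stated assumptions, not the patch itself: the  
helper name "wildcard_compatible" is made up, and "String#b" needs a newer  
Ruby than 1.9.0 (use "force_encoding" on a copy there).

```ruby
# Hypothetical helper, not part of the patch or of Ruby itself.
# Returns the encoding two strings could share under the proposed rule,
# or nil if they remain incompatible.
def wildcard_compatible(str1, str2)
  # Start with the interpreter's own compatibility check.
  enc = Encoding.compatible?(str1, str2)
  return enc if enc

  # Treat an ASCII-8BIT string as belonging to the other string's
  # encoding when its bytes are valid in that encoding.
  if str1.encoding == Encoding::ASCII_8BIT &&
     str1.dup.force_encoding(str2.encoding).valid_encoding?
    return str2.encoding
  end
  if str2.encoding == Encoding::ASCII_8BIT &&
     str2.dup.force_encoding(str1.encoding).valid_encoding?
    return str1.encoding
  end
  nil
end

su = "abc\u0639"       # UTF-8, because of the \u escape
sn = "abc\xD8\xB9".b   # the same bytes, tagged ASCII-8BIT

p wildcard_compatible(su, sn)
p wildcard_compatible(su, "\xFF".b)
```

The first call should report UTF-8, since the ASCII-8BIT bytes are valid  
UTF-8. The second should give nil, because "\xFF" is not valid UTF-8 and the  
strings stay incompatible.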

I know I seem to be in the minority with this issue, but I think there are  
a lot of benefits:

- Less need to worry about the encoding of output strings from libraries  
and methods like "Array#pack"
- Less need to worry how "\xNN" string literals are handled
- In most cases simple, encoding-unaware scripts should work without the  
need for "force_encoding"
- Better compatibility with 1.8 scripts
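
For context, this is roughly what the first and third points look like on an  
interpreter without the patch. It is a minimal sketch; the variable names  
are only for illustration.

```ruby
su = "abc\u0639"                                 # UTF-8 string
sn = [0x61, 0x62, 0x63, 0xD8, 0xB9].pack("C*")   # Array#pack("C*") returns ASCII-8BIT

# Without the patch, mixing the two raises once both contain non-ASCII bytes.
raised = begin
  su + sn
  false
rescue Encoding::CompatibilityError
  true
end

# The current workaround: tell Ruby what the bytes really are.
fixed = su + sn.dup.force_encoding("UTF-8")

p sn.encoding
p raised
p fixed
```

With the patch applied, the plain "su + sn" would succeed (with a warning  
under -w) and the explicit "force_encoding" would not be needed.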

As there seems to be a good likelihood that the patch will be rejected, I  
haven't tested it thoroughly, so I may have missed some things, in  
particular with regexps.
I also have not attempted to separate ASCII-8BIT and BINARY, so that  
strings which are explicitly forced to BINARY would not be converted. That  
may be a good idea.

Examples with the patch applied:

RUBYOPT=-w irb
/usr/local/lib/ruby/1.9.0/irb/context.rb:166: warning: method redefined;  
discarding old irb_name
irb(main):001:0> su = "abc\u0639"
"abcع"
irb(main):002:0> sn = "abc\xD8\xB9"
"abc\xD8\xB9"
irb(main):003:0> su + sn
(irb):3: warning: Assuming ASCII-8BIT string is UTF-8
"abcعabcع"
irb(main):004:0> sn + su
(irb):4: warning: Assuming ASCII-8BIT string is UTF-8
"abcعabcع"
irb(main):005:0> sn == su
true
irb(main):006:0> su << sn
(irb):6: warning: Assuming ASCII-8BIT string is UTF-8
"abcعabcع"
irb(main):007:0> sn << su
"abc\xD8\xB9abc\xD8\xB9abc\xD8\xB9"

I am happy to put more effort into this if I get positive feedback.
I think it is important because without something like this, there could  
be justifiable criticisms of the need for "force_encoding" and of poor  
backward compatibility with 1.8.

Cheers
Mike.


On Fri, 31 Oct 2008 15:42:55 +1100, Nobuyoshi Nakada <nobu / ruby-lang.org>  
wrote:

> Hi,
>
> At Fri, 31 Oct 2008 07:14:21 +0900,
> Michael Selig wrote in [ruby-core:19646]:
>> Feature #695 was closed & marked done, but unfortunately it does not  
>> seem
>> to have been implemented :-(
>
> Martin kindly replied already, so I don't have to add his post
> so much.
>
>> If you agree that this is a good idea, I don't mind trying to produce a
>> patch for it myself. Please let me know.
>
> I don't agree, but feel free to post your patch, of course.
>
------------32fHXckHKr4KOkSKFCDLAF
Content-Disposition: attachment; filename=ascii-8bit.pat
Content-Type: application/octet-stream; name=ascii-8bit.pat
Content-Transfer-Encoding: Base64

SW5kZXg6IGVuY29kaW5nLmMKPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PQotLS0gZW5jb2Rp
bmcuYwkocmV2aXNpb24gMjAxMTIpCisrKyBlbmNvZGluZy5jCSh3b3JraW5nIGNv
cHkpCkBAIC02MTksNiArNjE5LDE2IEBACiAJcmJfcmFpc2UocmJfZUVuY0NvbXBh
dEVycm9yLCAiaW5jb21wYXRpYmxlIGNoYXJhY3RlciBlbmNvZGluZ3M6ICVzIGFu
ZCAlcyIsCiAJCSByYl9lbmNfbmFtZShyYl9lbmNfZ2V0KHN0cjEpKSwKIAkJIHJi
X2VuY19uYW1lKHJiX2VuY19nZXQoc3RyMikpKTsKKyAgICBlbHNlIGlmIChSVEVT
VChydWJ5X3ZlcmJvc2UpKSB7CisJaW50IGlkeDEgPSByYl9lbmNfZ2V0X2luZGV4
KHN0cjEpOworCWludCBpZHgyID0gcmJfZW5jX2dldF9pbmRleChzdHIyKTsKKwor
CWlmIChpZHgxICE9IGlkeDIgJiYgKGlkeDEgPT0gRU5DSU5ERVhfQVNDSUkgfHwg
aWR4MiA9PSBFTkNJTkRFWF9BU0NJSSkKKwkgICAgJiYgKHJiX2VuY19zdHJfY29k
ZXJhbmdlKHN0cjEpICE9IEVOQ19DT0RFUkFOR0VfN0JJVCB8fAorCQlyYl9lbmNf
c3RyX2NvZGVyYW5nZShzdHIyKSAhPSBFTkNfQ09ERVJBTkdFXzdCSVQpKQorCSAg
ICByYl93YXJuaW5nKCJBc3N1bWluZyBBU0NJSS04QklUIHN0cmluZyBpcyAlcyIs
CisJCQlyYl9lbmNfbmFtZShyYl9lbmNfZ2V0KGlkeDIgPT0gRU5DSU5ERVhfQVND
SUkgPyBzdHIxIDogc3RyMikpKTsKKyAgICB9CiAgICAgcmV0dXJuIGVuYzsKIH0K
IApAQCAtNjgwLDYgKzY5MCwxMyBAQAogCX0KIAlpZiAoY3IxID09IEVOQ19DT0RF
UkFOR0VfN0JJVCkKIAkgICAgcmV0dXJuIGVuYzI7CisKKwlpZiAoaWR4MSA9PSBF
TkNJTkRFWF9BU0NJSSAmJgorCQlyYl9lbmNfc3RyX3ZhbGlkX2VuY29kaW5nKHN0
cjEsIGVuYzIpID09IFF0cnVlKQorCSAgICByZXR1cm4gZW5jMjsKKwlpZiAoaWR4
MiA9PSBFTkNJTkRFWF9BU0NJSSAmJiBCVUlMVElOX1RZUEUoc3RyMikgPT0gVF9T
VFJJTkcgJiYKKwkJcmJfZW5jX3N0cl92YWxpZF9lbmNvZGluZyhzdHIyLCBlbmMx
KSA9PSBRdHJ1ZSkKKwkgICAgcmV0dXJuIGVuYzE7CiAgICAgfQogICAgIHJldHVy
biAwOwogfQpJbmRleDogc3RyaW5nLmMKPT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PQotLS0g
c3RyaW5nLmMJKHJldmlzaW9uIDIwMTEyKQorKysgc3RyaW5nLmMJKHdvcmtpbmcg
Y29weSkKQEAgLTMyNSw2ICszMjUsMTMgQEAKICAgICByZXR1cm4gY3I7CiB9CiAK
K1ZBTFVFCityYl9lbmNfc3RyX3ZhbGlkX2VuY29kaW5nKFZBTFVFIHN0ciwgcmJf
ZW5jb2RpbmcgKmVuYykKK3sKKyAgICBpbnQgY3IgPSBjb2RlcmFuZ2Vfc2NhbihS
U1RSSU5HX1BUUihzdHIpLCBSU1RSSU5HX0xFTihzdHIpLCBlbmMpOworICAgIHJl
dHVybiBjciA9PSBFTkNfQ09ERVJBTkdFX0JST0tFTiA/IFFmYWxzZSA6IFF0cnVl
OworfQorCiBpbnQKIHJiX2VuY19zdHJfYXNjaWlvbmx5X3AoVkFMVUUgc3RyKQog
ewpAQCAtMTcxNSwxMCArMTcyMiwxNyBAQAogICAgIGlmIChzdHJfZW5jaW5kZXgg
IT0gcHRyX2VuY2luZGV4ICYmCiAgICAgICAgIHN0cl9jciAhPSBFTkNfQ09ERVJB
TkdFXzdCSVQgJiYKICAgICAgICAgcHRyX2NyICE9IEVOQ19DT0RFUkFOR0VfN0JJ
VCkgeworCS8qIFRyZWF0IEFTQ0lJLThCSVQgc3BlY2lhbGx5ICovCisJaWYgKHB0
cl9hOCAmJiBjb2RlcmFuZ2Vfc2NhbihwdHIsIGxlbiwgcmJfZW5jX2Zyb21faW5k
ZXgoc3RyX2VuY2luZGV4KSkgIT0gRU5DX0NPREVSQU5HRV9CUk9LRU4pIHsKKwkg
ICAgcmJfd2FybmluZygiQXNzdW1pbmcgQVNDSUktOEJJVCBzdHJpbmcgaXMgJXMi
LAorCQkJcmJfZW5jX25hbWUocmJfZW5jX2Zyb21faW5kZXgoc3RyX2VuY2luZGV4
KSkpOworCX0KKwllbHNlIGlmICghc3RyX2E4KSB7CiAgICAgICBpbmNvbXBhdGli
bGU6Ci0gICAgICAgIHJiX3JhaXNlKHJiX2VFbmNDb21wYXRFcnJvciwgImluY29t
cGF0aWJsZSBjaGFyYWN0ZXIgZW5jb2RpbmdzOiAlcyBhbmQgJXMiLAotICAgICAg
ICAgICAgcmJfZW5jX25hbWUocmJfZW5jX2Zyb21faW5kZXgoc3RyX2VuY2luZGV4
KSksCi0gICAgICAgICAgICByYl9lbmNfbmFtZShyYl9lbmNfZnJvbV9pbmRleChw
dHJfZW5jaW5kZXgpKSk7CisJICAgIHJiX3JhaXNlKHJiX2VFbmNDb21wYXRFcnJv
ciwgImluY29tcGF0aWJsZSBjaGFyYWN0ZXIgZW5jb2RpbmdzOiAlcyBhbmQgJXMi
LAorCQlyYl9lbmNfbmFtZShyYl9lbmNfZnJvbV9pbmRleChzdHJfZW5jaW5kZXgp
KSwKKwkJcmJfZW5jX25hbWUocmJfZW5jX2Zyb21faW5kZXgocHRyX2VuY2luZGV4
KSkpOworCX0KICAgICB9CiAKICAgICBpZiAoc3RyX2NyID09IEVOQ19DT0RFUkFO
R0VfVU5LTk9XTikgewpAQCAtMjA0OCw2ICsyMDYyLDggQEAKICAgICBpZHgxID0g
RU5DT0RJTkdfR0VUKHN0cjEpOwogICAgIGlkeDIgPSBFTkNPRElOR19HRVQoc3Ry
Mik7CiAgICAgaWYgKGlkeDEgPT0gaWR4MikgcmV0dXJuIFF0cnVlOworICAgIC8q
IEFsbG93IGNvbXBhcmlzb25zIGJldHdlZW4gQVNDSUktOEJJVCAmIG90aGVyIGVu
Y29kaW5ncyAqLworICAgIGlmIChpZHgxID09IDAgfHwgaWR4MiA9PSAwKSByZXR1
cm4gUXRydWU7CiAgICAgcmMxID0gcmJfZW5jX3N0cl9jb2RlcmFuZ2Uoc3RyMSk7
CiAgICAgcmMyID0gcmJfZW5jX3N0cl9jb2RlcmFuZ2Uoc3RyMik7CiAgICAgaWYg
KHJjMSA9PSBFTkNfQ09ERVJBTkdFXzdCSVQpIHsK

------------32fHXckHKr4KOkSKFCDLAF--