At 12:25 08/09/22, Michael Selig wrote:
>On Mon, 22 Sep 2008 12:35:49 +1000, Martin Duerst <duerst / it.aoyama.ac.jp>
>wrote:
>
>> Therefore, I think we should seriously consider this proposal,
>> and hopefully implement it before Sept. 25th. In terms of
>> implementation, I don't think it should be that difficult,
>> but it may be quite a bit of work to check
>> Encoding::default_internal in all the affected methods.
>
>Wow, that is rather ambitious - 3 days?

Well, that's the deadline for feature changes for 1.9.1.
It would be a real pity to wait for 2.0 for this.
The feature freeze wiki at
http://redmine.ruby-lang.org/wiki/ruby/DevelopersMeeting20080922
says that default_internal is currently pending, but that this
should be discussed/settled this week.

Anyhow, I had a look at the code, and it doesn't seem to be
that difficult. The function io_extract_encoding_option in io.c
seems to be central. I'm attaching a patch, which I hope is
a good start. I'm also writing to ruby-dev (in Japanese) because
that's where the real experts are.

The patch isn't as strict as your proposal with respect to
re-setting, but I'm fine either way.
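
To make the io.c part of the attached patch easier to follow, here is a
rough Ruby rendering of the decision it adds to io_extract_encoding_option.
This is only a sketch under assumptions: choose_read_encodings and its
parameter names are made up for illustration, the real code is C, and the
check that the file is opened for reading is left out.

```ruby
# Illustrative sketch only: choose_read_encodings is a hypothetical helper,
# not part of Ruby and not part of the patch.
#
# enc  - encoding named in the mode string (nil if none was given)
# enc2 - second encoding of an explicit pair (nil if absent)
# Returns [enc, enc2]; in the patch a non-nil enc2 means the data is
# transcoded from enc2 (external) to enc (internal) on read.
def choose_read_encodings(enc, enc2, default_internal, default_external)
  binary = Encoding::BINARY # also known as ASCII-8BIT

  # Nothing to do without a default_internal, or when the caller
  # already specified an explicit pair of encodings.
  return [enc, enc2] if default_internal.nil? || enc2

  if enc.nil?
    # Plain 'r': fall back to default_external as the source encoding.
    if default_external != default_internal && default_external != binary
      return [default_internal, default_external]
    end
  elsif enc != default_internal && enc != binary
    # 'r:shift_jis' and the like: the named encoding becomes the external one.
    return [default_internal, enc]
  end

  [enc, enc2]
end

p choose_read_encodings(Encoding::Shift_JIS, nil, Encoding::UTF_8, Encoding::EUC_JP)
p choose_read_encodings(Encoding::BINARY, nil, Encoding::UTF_8, Encoding::EUC_JP)
```

Binary (ASCII-8BIT) data and files already in the internal encoding are
left alone, so no transcoding is set up for them.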

I have tested this patch with code like the following
(called with -Eutf-8, -Eshift_jis, -Eeuc-jp, and without the -E option,
in all combinations):

>>>>
Encoding.default_internal = 'utf-8'  # tested with 'utf-8', 'shift_jis', and 'euc-jp'
s = "\u3042\u3044\u3046\u3048\u304A"

File.open('testout1.txt', 'w:shift_jis') do |f| f.write s end
File.open('testout2.txt', 'w:euc-jp') do |f| f.write s end
File.open('testout3.txt', 'w:utf-8') do |f| f.write s end

File.open('testout1.txt', 'r:shift_jis') do |f| s = f.read; p s.encoding end
File.open('testout2.txt', 'r:euc-jp') do |f| s = f.read; p s.encoding end
File.open('testout3.txt', 'r:utf-8') do |f| s = f.read; p s.encoding end
File.open('testout3.txt', 'r:ASCII-8BIT') do |f| s = f.read; p s.encoding end
# for next line, change file number to pick up default_internal
File.open('testout3.txt', 'r') do |f| s = f.read; p s.encoding end
>>>>

>The bulk of the implementation will be in the libraries, and I think many
>of them need updating to cope with non-ASCII encodings anyhow.

Yes. I'm not sure how libraries are affected by the feature freeze,
but they have to be fixed anyhow, completely independently of
default_internal. And I agree that this cannot be done in 3 days.

Regards,    Martin.

>> - We should think through various scenarios for output.
>> I can't think of any problems just now, I just noticed
>> the absence of considerations for output below.
>
>I did think about output to a certain extent, and one good thing is that
>IO already seems to automatically transcode to the "external" encoding at
>the moment. As for other classes, again I think most need updating to
>support multiple encodings anyhow. They will at a minimum need a way of
>having the user pass the "external" encoding (defaulting to
>"default_external"), and do the transcode as necessary, based on the
>encoding of the data to be output. However, as with IO, this behaviour
>probably should happen no matter whether "default_internal" is implemented
>or not.
>
>Cheers
>Mike
>


#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst / it.aoyama.ac.jp

Attachment: patch_default_internal.txt

Index: encoding.c
===================================================================
--- encoding.c	(revision 19510)
+++ encoding.c	(working copy)
@@ -1062,6 +1062,67 @@
 #endif
 }
 
+static int default_internal_index = -1;
+static rb_encoding *default_internal = 0;
+
+
+rb_encoding *
+rb_default_internal_encoding(void)
+{
+    return default_internal;
+}
+
+VALUE
+rb_enc_default_internal(void)
+{
+    return default_internal==0 ? Qnil : 
+	rb_enc_from_encoding(default_internal);
+}
+
+/*
+ * call-seq:
+ *   Encoding.default_internal => enc
+ *
+ * Returns default internal encoding (nil if unused).
+ *
+ */
+static VALUE
+get_default_internal(VALUE klass)
+{
+    return rb_enc_default_internal();
+}
+
+void
+rb_enc_set_default_internal(VALUE encoding)
+{
+    if (default_internal)
+	rb_warn("Resetting Encoding.default_internal");
+    if (encoding == Qnil) {
+        default_internal = 0;
+        default_internal_index = -1;
+    }
+    else {
+	default_internal = rb_to_encoding(encoding);
+	default_internal_index = rb_enc_to_index(default_internal);
+    }
+}
+
+/*
+ * call-seq:
+ *   Encoding.default_internal= enc => enc
+ *
+ * Sets default internal encoding (default is nil, i.e. unused).
+ * For use in main application; never use in a library!
+ * Returns nil. Produces a warning if reset.
+ *
+ */
+static VALUE
+set_default_internal(VALUE klass, VALUE encoding)
+{
+    rb_enc_set_default_internal(encoding);
+    return Qnil;
+}
+
 static void
 set_encoding_const(const char *name, rb_encoding *enc)
 {
@@ -1214,6 +1275,9 @@
     rb_define_singleton_method(rb_cEncoding, "default_external", get_default_external, 0);
     rb_define_singleton_method(rb_cEncoding, "locale_charmap", rb_locale_charmap, 0);
 
+    rb_define_singleton_method(rb_cEncoding, "default_internal",   get_default_internal, 0);
+    rb_define_singleton_method(rb_cEncoding, "default_internal=",  set_default_internal, 1);
+
     list = rb_ary_new2(enc_table.count);
     RBASIC(list)->klass = 0;
     rb_encoding_list = list;
Index: io.c
===================================================================
--- io.c	(revision 19510)
+++ io.c	(working copy)
@@ -3885,6 +3885,7 @@
     VALUE ecopts;
     int has_enc = 0, has_vmode = 0;
     VALUE intmode;
+    rb_encoding *def_internal;
 
     vmode = *vmode_p;
 
@@ -3972,6 +3973,20 @@
 
     *oflags_p = oflags;
     *fmode_p = fmode;
+    if (fmode&FMODE_READABLE && !enc2 && (def_internal=rb_default_internal_encoding())) {
+	rb_encoding *def_external = rb_default_external_encoding();
+	rb_encoding *ascii_8bit = rb_enc_find("ASCII-8BIT");
+	if (!enc) {
+	    if (def_external!=def_internal && def_external!=ascii_8bit) {
+	        enc  = def_internal;
+	        enc2 = def_external;
+	    }
+	}
+	else if (enc!=def_internal && enc!=ascii_8bit) {
+	    enc2 = enc;
+	    enc = def_internal;
+	}
+    }
     convconfig_p->enc = enc;
     convconfig_p->enc2 = enc2;
     convconfig_p->ecflags = ecflags;
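
The test script in the message can also be written so that it checks its own
results. The following is a minimal sketch, assuming a Ruby in which
Encoding.default_internal and its setter are available and files opened with
an explicit external encoding are transcoded on read as described above. The
temporary directory and file names are arbitrary choices for illustration,
and only the 'utf-8' setting of default_internal is exercised.

```ruby
require 'tmpdir'

text = "\u3042\u3044\u3046\u3048\u304A"
saved = Encoding.default_internal   # restored at the end; may be nil
results = {}

begin
  Encoding.default_internal = 'utf-8'

  Dir.mktmpdir do |dir|
    %w[shift_jis euc-jp utf-8].each do |ext|
      path = File.join(dir, "testout-#{ext}.txt")
      # Write in the external encoding, then read back with the same one.
      File.open(path, "w:#{ext}") { |f| f.write text }
      data = File.open(path, "r:#{ext}") { |f| f.read }
      # Record the encoding of the string read and whether it round-tripped.
      results[ext] = [data.encoding, data == text]
    end

    # Binary reads are expected to stay untouched.
    path = File.join(dir, 'testout-utf-8.txt')
    results['binary'] = File.open(path, 'r:ASCII-8BIT') { |f| f.read }.encoding
  end
ensure
  Encoding.default_internal = saved
end

results.each { |name, value| puts "#{name}: #{value.inspect}" }
```

Restoring the previous value in the ensure clause keeps the global setting
from leaking into other code, in line with the patch comment that the setter
is meant for the main application only.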