Issue #5684 has been updated by Yui NARUSE.


Vladimir Chernis wrote:
> Yui NARUSE wrote:
> > You can set encodings to a Socket object with Socket#set_encoding.
> 
> I understand, but if I don't call Socket#set_encoding, shouldn't the encoding fall back to the default encoding specified by the -E option to ruby?

Socket doesn't respect default_external because default_external is derived from the locale of the client system,
whereas the encoding of strings read from a socket depends on the server software.
Moreover, data from a socket is usually binary.
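
The difference between the binary and textual APIs can be sketched as follows. This is only an illustration, not the attached server.rb/client.rb: the in-process server thread, the loopback address, and the variable names are assumptions made so the example is self-contained.

```ruby
require 'socket'

# Hypothetical stand-in for server.rb: sends the bytes "hell\xF6" and closes.
server  = TCPServer.new('127.0.0.1', 0)  # port 0 lets the OS pick a free port
port    = server.addr[1]
payload = "hell\xF6".b                   # binary copy of the test bytes

writer = Thread.new do
  2.times do
    client = server.accept
    client.write(payload)
    client.close
  end
end

# Binary API: recv returns a binary (ASCII-8BIT) string,
# regardless of -E or Encoding.default_external.
sock = TCPSocket.new('127.0.0.1', port)
raw  = sock.recv(16)
sock.close

# Textual API: declare the encoding the server is known to use,
# then read until EOF with IO#read (no length argument).
sock = TCPSocket.new('127.0.0.1', port)
sock.set_encoding('ISO-8859-1')
text = sock.read
sock.close

writer.join
server.close

puts "recv: #{raw.inspect} (#{raw.encoding})"
puts "read: #{text.inspect} (#{text.encoding})"
```

The encoding passed to set_encoding has to come from knowledge of the server or protocol; the client's locale cannot tell you what the peer sends.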

> > But Socket#recv is a binary API like IO#read(n).
> > You can use textual API IO#read and get ISO-8859-1 string.
> 
> Is IO#read the same as Socket#read? Because changing `recv` to `read` in client.rb doesn't change anything about the encoding.
> 
> I know File#read respects the default encoding. It would be nice if Socket#read did the same thing, especially since Net::HTTP uses Socket.

File and Socket are different.
Note that Net::HTTP's encoding policy is independent of Socket's.

> Am I mistaken to expect this behavior?

The conclusion is: yes, that expectation is mistaken.
----------------------------------------
Bug #5684: [[Ruby 1.9:]] Socket doesn't respect default external encoding
http://redmine.ruby-lang.org/issues/5684

Author: Vladimir Chernis
Status: Open
Priority: Normal
Assignee: 
Category: 
Target version: 
ruby -v: ruby 1.9.2p290 (2011-07-09 revision 32553) [x86_64-darwin11.2.0] 


When receiving data from a TCPSocket (as in client.rb, attached), the default external encoding specified by the -E option to ruby is not respected.

Steps:
(1) In terminal window A, run: ruby server.rb
(2) In terminal window B, run: ruby -E ISO-8859-1 client.rb

Expected result for terminal window B:
bytes: "hell\xF6"
encoding: ISO-8859-1

Actual result for terminal window B:
bytes: "hell\xF6"
encoding: ASCII-8BIT

Workaround:
Use String#force_encoding('ISO-8859-1')
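
A minimal sketch of this workaround, assuming the received bytes are already in hand (the literal below stands in for the result of recv, and the variable names are illustrative):

```ruby
# Stand-in for the ASCII-8BIT string returned by TCPSocket#recv.
raw = "hell\xF6".b

# force_encoding relabels the bytes without changing them.
# dup keeps the original binary string untouched.
text = raw.dup.force_encoding('ISO-8859-1')

# Once the label is correct, the string can be transcoded normally.
utf8 = text.encode('UTF-8')

puts text.encoding         # encoding label after the workaround
puts text.valid_encoding?  # whether the bytes are valid for that label
puts utf8                  # transcoded copy
```

force_encoding only changes the label, so it is appropriate here only because the server is known to send ISO-8859-1.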



-- 
http://redmine.ruby-lang.org