Kless wrote:
> I need to store raw strings such as this one:
> "V-\243\230mJ\262.\031\023-4\301\324\241Y"
> and I would like to know if there will be any problem with Ruby 1.9

The answer is, "that depends": Ruby 1.9's string handling is extremely 
complicated.

* If the string is a literal within the program source, then adding a comment

# encoding: ASCII-8BIT

as the very first line of your program (or the second line if you have a 
shebang line) will make literals have this encoding by default. Having 
said that, literals containing high-bit backslash escapes like yours 
will probably get ASCII-8BIT anyway if the file has no encoding comment.

* If the string comes from reading a file, then you need to open it in 
binary mode: File.open("xxx","rb") { |f| ... }

* If the string comes from reading from a socket, then I believe it will 
be ASCII-8BIT by default.

* If the string comes from reading STDIN, then you will have to be very 
careful; for safety you need something like

  STDIN.set_encoding "ASCII-8BIT"

Your program may or may not work without these changes, because Ruby 
1.9's behaviour at runtime depends on settings in your environment. That 
is, the same program with the same data might work on one computer but 
crash on another computer. Using the above incantations is your first 
line of defense against this stupidity.
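
If you want to see what you actually got on a given machine, ask each 
string for its encoding. Here is a minimal sketch for the file case; the 
helper name read_encodings and the scratch file are made up for 
illustration, and the sample bytes are just the start of your string:

```ruby
require 'tempfile'

# Hypothetical helper: write `bytes` to a scratch file, read it back in
# text mode and in binary mode, and report the encoding of each result.
def read_encodings(bytes)
  result = nil
  Tempfile.open("rawbytes") do |tmp|
    tmp.binmode
    tmp.write(bytes)
    tmp.flush

    # Text mode: the string is tagged with Encoding.default_external,
    # which Ruby derives from the environment.
    text = File.open(tmp.path, "r") { |f| f.read }

    # Binary mode: the string is tagged ASCII-8BIT.
    bin = File.open(tmp.path, "rb") { |f| f.read }

    result = { :text => text.encoding, :binary => bin.encoding }
  end
  result
end

# Build the sample with pack so it does not depend on the source encoding.
sample = [86, 45, 0243, 0230, 109, 74, 0262].pack("C*")
encodings = read_encodings(sample)

puts "default_external: #{Encoding.default_external}"
puts "text mode:        #{encodings[:text]}"
puts "binary mode:      #{encodings[:binary]}"
```

The first two lines of output can change from one machine to the next; 
the last one should not.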

Then you need to be sure that every single method that you call in other 
people's libraries, which takes string arguments or returns string 
values, behaves in the way you want. For example, if you call 
Library.foo and it returns a string whose encoding is UTF-8 and contains 
characters with the high bit set, and you try to concatenate it with one 
of your own binary strings, the program will crash.

Here's a somewhat contrived example:

-------- main.rb (your program) --------
# encoding: ASCII-8BIT

require 'library'
binary_data = "\xff\xee\xdd"
msg = Library.err_to_str
binary_data << [msg.bytesize].pack("N")
binary_data << msg

-------- library.rb (someone else's code that you don't control) --------
# encoding: UTF-8

module Library
  def self.err_to_str
    "über-error"
  end
end

$ ruby19 main.rb
main.rb:7:in `<main>': incompatible character encodings: ASCII-8BIT and UTF-8 (Encoding::CompatibilityError)
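
You can at least detect the clash before it blows up, using 
Encoding.compatible?, which returns nil when two strings cannot be 
combined. A minimal sketch; the helper name safe_to_concat? is made up, 
and I am assuming that << fails in the same cases where 
Encoding.compatible? returns nil:

```ruby
# Hypothetical helper: true if Ruby considers a and b combinable.
def safe_to_concat?(a, b)
  !Encoding.compatible?(a, b).nil?
end

binary_data = [0xff, 0xee, 0xdd].pack("C*")   # ASCII-8BIT, high-bit bytes
plain_msg   = "plain error".encode("UTF-8")   # UTF-8 label, 7-bit content
uber_msg    = "\u00fcber-error"               # UTF-8 with a non-ASCII character

puts safe_to_concat?(binary_data, plain_msg)  # 7-bit content is compatible
puts safe_to_concat?(binary_data, uber_msg)   # high-bit content on both sides is not
```

Note what this implies: the same library call can work for months and 
then fail the first time it returns a non-ASCII character.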

Your only way to protect against this is to force encodings at every 
point where two strings of differing provenance might encounter each 
other, e.g.

msg = Library.err_to_str
binary_data << [msg.bytesize].pack("N")
msg.force_encoding "ASCII-8BIT"
binary_data << msg

Beware also that Ruby 1.9's documentation is often either missing or 
misleading when it comes to character encodings. For example, ri19 
Array#pack says:

      Directive    Meaning
      ---------------------------------------------------------------
          @     |  Moves to absolute position
          A     |  arbitrary binary string (space padded, count is width)
          a     |  arbitrary binary string (null padded, count is width)

So you might expect that an arbitrary String can be packed using a*:

# encoding: ASCII-8BIT

require 'library'
binary_data = "\xff\xee\xdd"
msg = Library.err_to_str
binary_data << [msg.bytesize,msg].pack("Na*")    # CRASH
puts binary_data.inspect

No: you still need to call msg.force_encoding "ASCII-8BIT" before the pack.
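
Keep in mind that force_encoding relabels the string in place, so it 
also changes the object the library handed you. One way round that is 
to relabel a copy inside a small helper. A minimal sketch; the name 
length_prefixed is made up, and the UTF-8 literal stands in for 
Library.err_to_str:

```ruby
# Hypothetical helper: encode str as a 4-byte big-endian length followed
# by its raw bytes. The caller's string is left untouched because only a
# copy is relabelled as ASCII-8BIT.
def length_prefixed(str)
  raw = str.dup.force_encoding("ASCII-8BIT")
  [raw.bytesize, raw].pack("Na*")
end

msg = "\u00fcber-error"                       # stand-in for Library.err_to_str
binary_data = [0xff, 0xee, 0xdd].pack("C*")   # ASCII-8BIT by construction
binary_data << length_prefixed(msg)

puts binary_data.inspect
puts msg.encoding        # still UTF-8: the original was not modified
```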

If all this scares you - and it does me - then remember that staying 
with Ruby 1.8 is a reasonable alternative. Ruby 1.8.6 is going to be 
maintained for a long time going forward, thanks to the people at 
EngineYard and Phusion Passenger.

HTH,

Brian.
-- 
Posted via http://www.ruby-forum.com/.