Hi,

In short, the comment was wrong.  O_BINARY only disables newline
conversion; it does not change the encoding of the output.  I recommend
the "b" file mode, which is smarter: it also sets the external encoding
to ASCII-8BIT.  Whether we should update PStore is controversial.  The
discussion should move to ruby-core.
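
A minimal sketch of the difference, assuming a reasonably recent Ruby.
The helper name external_encoding_for is hypothetical and only for
illustration.  The result for integer flags alone depends on the
platform and on the default encodings, so it is printed rather than
assumed.

```ruby
require 'tmpdir'

# Hypothetical helper (not part of Ruby): open the file with the given mode
# and options, and report the external encoding the File object ends up with.
def external_encoding_for(path, mode, **opts)
  File.open(path, mode, **opts) { |f| f.external_encoding }
end

Dir.mktmpdir do |dir|
  path  = File.join(dir, 'test')
  flags = File::CREAT | File::RDWR | File::BINARY

  # String mode with "b": binary mode, external encoding is ASCII-8BIT.
  puts "w+b:              #{external_encoding_for(path, 'w+b').inspect}"

  # Integer flags alone: depends on platform and default encodings.
  puts "flags only:       #{external_encoding_for(path, flags).inspect}"

  # Integer flags with the external encoding stated explicitly.
  explicit = external_encoding_for(path, flags,
                                   external_encoding: Encoding::ASCII_8BIT)
  puts "flags + encoding: #{explicit.inspect}"
end
```

If integer flags are required, passing external_encoding explicitly
avoids relying on what File::BINARY happens to do on a given platform.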

							matz.

In message "Re: File::BINARY does not behave as advertised. How do I help to fix this?"
    on Tue, 13 Sep 2011 01:47:11 +0900, Cameron Pope <camerooni / gmail.com> writes:
|
|I noticed some anomalous behavior opening files with the file mode
|flags. If the default internal encoding is set, when using the file
|mode flags to open a file, the file's external encoding is set to
|something other than ASCII-8BIT, which can cause binary file
|operations (such as Marshal.dump) to blow up.
|
|Please forgive the long message, and let me know if it would be more
|appropriate to open some issues, but since I've never posted a Ruby
|issue before, I wanted to make sure I was not being naive and that I
|understand what is really going on.
|
|Here is a simple example of what I mean:
|
|  #!/usr/bin/env ruby
|  Encoding.default_internal = 'UTF-8'
|
|  File.open('test',File::CREAT | File::RDWR | File::BINARY) do |f|
|    # This should be ASCII-8BIT, right? At least according to io.c, line 10792
|    puts "Integer Flags Encoding: #{f.external_encoding.to_s}"
|  end
|
|  File.open('test2','w+b') do |f|
|    # This actually is ASCII-8BIT
|    puts "String Mode Encoding: #{f.external_encoding.to_s}"
|  end
|
|And running it:
|
|  file-binary-test cpope$  ruby simple_file_test.rb
|  Integer Flags Encoding: UTF-8
|  String Mode Encoding: ASCII-8BIT
|
|I don't think that is the intended behavior. If I look at io.c in the
|latest Ruby code snapshot:
|
|  --- io.c (last night's snapshot)
|  10792 #ifndef O_BINARY
|  10793 # define  O_BINARY 0
|  10794 #endif
|  10795     /* disable line code conversion and make ASCII-8BIT */
|  10796     rb_file_const("BINARY", INT2FIX(O_BINARY));
|
|As one can see above, first of all, File::BINARY will be zero in every
|case that I can suss out in the Ruby source code - there is nowhere in
|the 1.9.x codebase I can see that defines O_BINARY to be anything but
|zero, and as was empirically demonstrated above, opening a file with
|this constant will not set the encoding to ASCII-8BIT. What is really
|bad about this is that, when the integer flags are used to open a
|file, there is no good way to check whether the developer intended it
|to be opened as a binary file. There is, of course, a way to manually
|specify the encoding for a file opened with the integer flags, which
|would be the right thing to do in the case above.
|
|So my first question is: How do we address this deficiency? I can't
|think of a better way than to document the 'catch' with using the
|integer flags in this case. I've noticed that many of the File
|constants aren't documented, so I'm happy to give it a shot if that's
|the best approach.
|
|But this brings us to another issue. There are some places in the Ruby
|standard library that depend on File::BINARY actually opening a file
|suitable for writing binary data. For example, in PStore:
|
|At the top of lib/pstore.rb:
|  class PStore
|    binmode = defined?(File::BINARY) ? File::BINARY : 0
|    RDWR_ACCESS = File::RDWR | File::CREAT | binmode
|    RD_ACCESS = File::RDONLY | binmode
|    WR_ACCESS = File::WRONLY | File::CREAT | File::TRUNC | binmode
|
|These flags are passed to the bottlenecks that open the data file for
|reading and writing. Because it is using the integer constants to
|define how the file is opened, it's not hard to make PStore blow up in
|the course of normal operation. To conserve space, I've put some
|sample code in this gist: https://gist.github.com/1211614
|
|So my second thought is that this is an issue with the PStore library,
|and that it would be appropriate to modify the file bottlenecks so
|they explicitly specify ASCII-8BIT as the file encoding. Is there any
|reason that I'm off target and I should not log that as an issue with
|a test and a patch?
|
|Apologies in advance if I am using the wrong forum or am totally
|off-base with my questions.
|
|Thank you for your time,
|Cameron
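
As a rough illustration of the second suggestion in the quoted message
(stating ASCII-8BIT explicitly when a file is opened with integer
flags), here is a sketch of a Marshal round trip.  The helper names
dump_binary and load_binary are hypothetical, and this is not PStore's
actual code; the flags are only modelled on the constants quoted above.

```ruby
require 'tmpdir'

# Hypothetical helper: write a marshalled object using integer flags,
# stating the external encoding explicitly instead of relying on File::BINARY.
def dump_binary(path, object)
  flags = File::WRONLY | File::CREAT | File::TRUNC | File::BINARY
  File.open(path, flags, external_encoding: Encoding::ASCII_8BIT) do |f|
    Marshal.dump(object, f)
  end
end

# Hypothetical helper: read the marshalled object back the same way.
def load_binary(path)
  flags = File::RDONLY | File::BINARY
  File.open(path, flags, external_encoding: Encoding::ASCII_8BIT) do |f|
    Marshal.load(f)
  end
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 'store')
  data = { 'bytes' => "\xFF\xFE".b }  # bytes that are not valid UTF-8

  dump_binary(path, data)
  p load_binary(path) == data
end
```

Whether a change along these lines belongs in PStore itself is the open
question raised in the reply above.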