Issue #7200 has been updated by naruse (Yui NARUSE).


shyouhei (Shyouhei Urabe) wrote:
> So yui says this issue is illustrative because it was reported by Brian.  What a ...
> 
> I feel very sorry, Brian.  I can do nothing anymore.

Don't do FUD.

Brian said they are inconsistent even if mode_enc looks like an encoding.
I showed the reason: they are different things, and they take different types of arguments.

If Brian is not satisfied with that reason and has a better idea, he should show it with an actual use case.
I assumed Brian created this ticket out of Rubinius/RubySpec interest, and that assumption should be reasonable because no use case was given.
What I criticize is imagining a fictional desire on his behalf and blaming me for it.

duerst (Martin Dürst) wrote:
> The current behavior is in part influenced by implementation. But there is also a conceptual issue, because "bom|" only applies at the start of the file, and may have different implications for input (check for a BOM) and output (add a BOM). So we have to carefully think what's the best way to make this easy for programmers to use the right way.

Mainly it is conceptual.
The BOM|UTF-* specifier has two main functions:
* skip U+FEFF at the beginning of the file
* set the external encoding according to the BOM that is found
Such behavior is considered a derivative of the mode; it is not an encoding.
Because it is not an encoding, it can't be used in contexts that expect an encoding.
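
As a rough illustration of the two functions above, here is a minimal sketch. The temporary file, its contents, and the variable names are made up for the example; it only uses the mode-string form ("r:BOM|UTF-8"), and the behavior shown is what I would expect on a Ruby that supports this specifier.

```ruby
require "tempfile"

# Create a small UTF-8 file that starts with a BOM (U+FEFF, bytes EF BB BF).
# The file name and contents are hypothetical, chosen only for this sketch.
tmp = Tempfile.new("bom_demo")
tmp.binmode
tmp.write("\xEF\xBB\xBFhello".b)
tmp.close

# Plain "UTF-8": the BOM is read as the first character of the data.
plain = File.open(tmp.path, "r:UTF-8") { |f| f.read }

# "BOM|UTF-8": the leading U+FEFF is skipped, and the external encoding
# is set according to the BOM that was found.
stripped, external = File.open(tmp.path, "r:BOM|UTF-8") do |f|
  [f.read, f.external_encoding]
end

# The specifier is not a registered encoding name, so an encoding lookup
# rejects it with ArgumentError.
lookup_error = begin
  Encoding.find("BOM|UTF-8")
  nil
rescue ArgumentError => e
  e
end

tmp.unlink
```

Here `plain` still begins with U+FEFF while `stripped` does not, and `Encoding.find` fails because "BOM|UTF-8" names a reading behavior rather than an encoding.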

See also http://bugs.ruby-lang.org/issues/1951 and related tickets.
----------------------------------------
Bug #7200: Setting external encoding with BOM|
https://bugs.ruby-lang.org/issues/7200#change-32616

Author: brixen (Brian Ford)
Status: Assigned
Priority: Normal
Assignee: naruse (Yui NARUSE)
Category: 
Target version: 2.0.0
ruby -v: ruby 1.9.3p286 (2012-10-12 revision 37165) [x86_64-darwin10.8.0]


File.open will accept, for example, :encoding => "bom|utf-16be:euc-jp" or :encoding => "bom|utf-16be". However, :external_encoding => "bom|utf-16be" raises an ArgumentError. Likewise, IO#set_encoding will accept "bom|utf-16be:euc-jp" but raises an ArgumentError if passed "bom|utf-16be", "euc-jp".

It is inconsistent to accept "bom|utf-*" in some cases and not others.

See the following IRB transcript.

$ irb
1.9.3p286 :001 > f = File.open "foo.txt", "r", :encoding => "bom|utf-16be:euc-jp"
 => #<File:foo.txt> 
1.9.3p286 :002 > f.internal_encoding
 => #<Encoding:EUC-JP> 
1.9.3p286 :003 > f.external_encoding
 => #<Encoding:UTF-16BE> 
1.9.3p286 :004 > f.close
 => nil 
1.9.3p286 :005 > f = File.open "foo.txt", "r"
 => #<File:foo.txt> 
1.9.3p286 :006 > f.set_encoding "bom|utf-16be:euc-jp"
 => #<File:foo.txt> 
1.9.3p286 :007 > f.internal_encoding
 => #<Encoding:EUC-JP> 
1.9.3p286 :008 > f.external_encoding
 => #<Encoding:UTF-16BE> 
1.9.3p286 :009 > f.close
 => nil 
1.9.3p286 :010 > f = File.open "foo.txt", "r"
 => #<File:foo.txt> 
1.9.3p286 :011 > f.set_encoding "bom|utf-16be", "euc-jp"
ArgumentError: unknown encoding name - bom|utf-16be
	from (irb):11:in `set_encoding'
	from (irb):11
	from /Users/brian/.rvm/rubies/ruby-1.9.3-p286/bin/irb:16:in `<main>'
1.9.3p286 :012 > f = File.open "foo.txt", "w", :external_encoding => "bom|utf-16be"
ArgumentError: unknown encoding name - bom|utf-16be
	from (irb):12:in `initialize'
	from (irb):12:in `open'
	from (irb):12
	from /Users/brian/.rvm/rubies/ruby-1.9.3-p286/bin/irb:16:in `<main>'
1.9.3p286 :013 > f = File.open "foo.txt", "rb", :encoding => "bom|utf-16be"
 => #<File:foo.txt> 
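
The same comparison can be sketched as a small script that reports which forms accept the prefix. This is only a sketch under assumptions: the helper `accepted?` and the result keys are hypothetical names, it uses "bom|utf-8" instead of "bom|utf-16be" to sidestep unrelated restrictions on ASCII-incompatible encodings in text mode, and the results may differ between Ruby versions.

```ruby
require "tempfile"

# Hypothetical helper: true if the block runs, false if it raises ArgumentError.
def accepted?
  yield
  true
rescue ArgumentError
  false
end

results = {}

Tempfile.create("bom_probe") do |tmp|
  tmp.close
  path = tmp.path

  # :encoding option with "bom|ext:int"
  results[:open_encoding] =
    accepted? { File.open(path, "r", :encoding => "bom|utf-8:euc-jp") { } }

  # :external_encoding option with the "bom|" prefix
  results[:open_external_encoding] =
    accepted? { File.open(path, "r", :external_encoding => "bom|utf-8") { } }

  # IO#set_encoding with a single "bom|ext:int" string
  results[:set_encoding_one_arg] =
    accepted? { File.open(path, "r") { |f| f.set_encoding("bom|utf-8:euc-jp") } }

  # IO#set_encoding with separate external and internal arguments
  results[:set_encoding_two_args] =
    accepted? { File.open(path, "r") { |f| f.set_encoding("bom|utf-8", "euc-jp") } }
end

results.each do |form, ok|
  puts "#{ok ? 'accepted' : 'rejected'}  #{form}"
end
```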

Thanks,
Brian


-- 
http://bugs.ruby-lang.org/