Perry Smith wrote:
> I think I'm going to open a bug report -- it might not be a bug but I 
> sure am confused.

It's not a bug(*), and it sure is confusing. My own attempt to document 
Ruby 1.9's encoding rules, which is woefully incomplete but covers about 
200 different cases, is at
http://github.com/candlerb/string19/blob/master/string19.rb

What you've observed is described in section 3.3.

Basically, a Regexp which contains only ASCII characters is given an 
encoding of US-ASCII, regardless of the encoding of the string it was 
built from. (This differs from Strings, which may carry an encoding 
such as UTF-8 and still report ascii_only? as true when they contain 
only ASCII characters.)
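
To make the difference concrete, here is a small sketch (the variable 
names are only for illustration). It assumes the pattern text starts out 
as a UTF-8 String, which I force with encode so that the result doesn't 
depend on the encoding of the source file:

```ruby
# ASCII-only text, explicitly labelled UTF-8
str = "string".encode(Encoding::UTF_8)
re  = Regexp.new(str)

str_encoding = str.encoding     # the String keeps its UTF-8 label...
str_ascii    = str.ascii_only?  # ...and separately reports ASCII-only content
re_encoding  = re.encoding      # the Regexp is relabelled US-ASCII

# A pattern with a non-ASCII character keeps the encoding of its source
wide          = Regexp.new("caf\u00e9")
wide_encoding = wide.encoding

puts "String: #{str_encoding} (ascii_only? #{str_ascii})"
puts "Regexp: #{re_encoding}"
puts "Non-ASCII Regexp: #{wide_encoding}"
```

So the String remembers where it came from, while the ASCII-only Regexp 
does not.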

However, there is a hidden "fixed_encoding" property you can set on a 
Regexp:

>> r1 = Regexp.new("string")
=> /string/
>> r2 = Regexp.new("string", Regexp::FIXEDENCODING)
=> /string/
>> r1.encoding
=> #<Encoding:US-ASCII>
>> r2.encoding
=> #<Encoding:UTF-8>
>> r1.fixed_encoding?
=> false
>> r2.fixed_encoding?
=> true
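
Where the flag matters in practice is when matching against a string in 
some other encoding. The following is a rough sketch based on my reading 
of the Regexp#fixed_encoding? documentation; the EUC-JP test strings and 
the variable names are my own choice for illustration:

```ruby
loose = Regexp.new("string")
fixed = Regexp.new("string".encode(Encoding::UTF_8), Regexp::FIXEDENCODING)

# One two-byte EUC-JP character, a space, then ASCII text
euc_wide  = "\xa1\xa2 string".dup.force_encoding(Encoding::EUC_JP)
# ASCII-only content that merely carries an EUC-JP label
euc_ascii = "string".dup.force_encoding(Encoding::EUC_JP)

# Without the flag, the regexp adapts to the string's encoding
loose_pos = loose =~ euc_wide

# With the flag, an ASCII-only string in another encoding still matches
ascii_pos = fixed =~ euc_ascii

# With the flag, a non-ASCII string in another encoding is rejected
fixed_result =
  begin
    fixed =~ euc_wide
  rescue Encoding::CompatibilityError => e
    e.class
  end

puts "loose vs non-ASCII EUC-JP: #{loose_pos.inspect}"
puts "fixed vs ASCII-only EUC-JP: #{ascii_pos.inspect}"
puts "fixed vs non-ASCII EUC-JP: #{fixed_result.inspect}"
```

In other words, the flag pins the Regexp to one encoding, and only 
strings that are compatible with it can be matched.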

I say it's a "hidden" property because the flag isn't revealed if you 
use inspect or to_s (unlike the //m, //i and //x properties):

>> r1.to_s
=> "(?-mix:string)"
>> r2.to_s
=> "(?-mix:string)"
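
If you do need to tell the two apart, fixed_encoding? is the direct 
route. As far as I can tell the flag also shows up as a bit in 
Regexp#options, so a check along these lines should work (a sketch only; 
the *_flag names are made up for the example):

```ruby
r1 = Regexp.new("string")
r2 = Regexp.new("string", Regexp::FIXEDENCODING)

# The printed forms are identical...
same_text = (r1.to_s == r2.to_s) && (r1.inspect == r2.inspect)

# ...but the options bitmask differs
r1_flag = (r1.options & Regexp::FIXEDENCODING) != 0
r2_flag = (r2.options & Regexp::FIXEDENCODING) != 0

puts "same printed form: #{same_text}"
puts "r1 fixed bit set:  #{r1_flag}"
puts "r2 fixed bit set:  #{r2_flag}"
```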

HTH,

Brian.

(*) Except inasmuch as the entire Encoding nonsense in Ruby 1.9 is one 
enormous bug.