On Apr 13, 2009, at 6:12 PM, NARUSE, Yui wrote:

> And these are set Regexp::FIXEDENCODING.
> This raise exceptions on strings with other encodings
> even if the regexp contains only 7-bit.
> The constant Regexp::FIXEDENCODING is defined in 1.9.2
> but the value is also used in 1.9.1.

I'm sorry, but I don't think I understood this.  I tried to check it  
in irb, but that confused me even more:

$ irb_dev
irb(main):001:0> Regexp::FIXEDENCODING
=> 16

Can you explain what the magic 16 means here please?

>> * A / literal that would be US-ASCII due to the source Encoding
>> or /n will be upgraded to ASCII-8BIT by hex, octal, control, meta,
>> or control-meta byte escapes (as discussed in [ruby-core:23184])
> similar to above, /n raise warnings on other than ASCII-8BIT strings.

I'm not sure I understand.  What wouldn't be valid in ASCII-8BIT?

>> * A / literal will receive a UTF-8 Encoding if it includes \u  
>> escapes
>> * Regexp objects constructed with Regexp::new() receive the  
>> Encoding of the String passed containing the regular expression
>> Am I right so far?  Am I missing any variations?
>> Am I right that Regexp's favor US-ASCII because it maximizes their
>> compatibility?  It makes it so you can use them on any ASCII
>> compatible String instead of just a String in the source Encoding,
>> right?
>
> Yes, and if you set Regexp::FIXEDENCODING the regexp will match only the
> same encoding.

Again, I'm not sure how I set this.
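
For what it's worth, here is a small sketch of how I would expect to observe the flag, going only by the description above.  It assumes that a /u literal carries the fixed-encoding bit, and that Regexp#options and Regexp#fixed_encoding? report it; the variable names are just mine for illustration, and none of this is confirmed in the thread:

```ruby
# Sketch only: probe the fixed-encoding flag on two literals.
plain = /a/    # 7-bit content, no encoding modifier
fixed = /a/u   # assumption: /u pins the regexp to UTF-8

flag = Regexp::FIXEDENCODING  # treated here as one bit in Regexp#options

plain_fixed   = plain.fixed_encoding?
fixed_fixed   = fixed.fixed_encoding?
fixed_bit_set = (fixed.options & flag) != 0

# Two non-ASCII bytes tagged as EUC-JP, i.e. not compatible with UTF-8.
euc = [0xa4, 0xa2].pack("C*").force_encoding("EUC-JP")

# The plain regexp should simply fail to match; the fixed one is
# expected to raise because the encodings cannot be reconciled.
plain_result = plain.match(euc)
mismatch_error =
  begin
    fixed.match(euc)
    nil
  rescue Encoding::CompatibilityError => e
    e.class
  end

p plain_fixed, fixed_fixed, fixed_bit_set, plain_result, mismatch_error
```

If that is right, the flag is something a literal picks up from its modifiers or escapes rather than something I set by hand, but I would still like confirmation.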

I really appreciate all your help.  Sorry I was too dumb to understand this time.