Hi!

I noticed a problem when using the special character-modifying escape
sequences (\M-x and \C-x) in UTF-8 encoded strings...

t = "a\M-aa"
puts t.encoding                     # => <Encoding:UTF-8>
puts t.length                       # => 2
puts t.bytesize                     # => 3
t.each_byte{|b|print("%X " %b)}     # => 61 E1 61
puts
t.each_char{|c|print("%X" % c.ord)} # => utf8manipul.rb:7:in `each_char':
                                     # => index out of range (IndexError)
                                     # =>         from utf8manipul.rb:7:
                                     # => in `<main>'
                                     # => 61

..., because "\M-a" produces the byte 0xE1, which is a possible leading
byte of a three-byte UTF-8 encoded character, but the "a" (0x61) that
follows it is not a continuation byte.
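The byte arithmetic can be reproduced without the escape itself. This is a minimal sketch, assuming that \M- simply sets the high bit (0x80) of the following character, which matches the each_byte output above; the variable names are only illustrative.

```ruby
# Assumption: "\M-a" is the byte of "a" with the high bit set.
meta_a = "a".ord | 0x80                 # 0x61 | 0x80

# Build the same three bytes as in the report and tag them as UTF-8.
s = [0x61, meta_a, 0x61].pack("C*").force_encoding(Encoding::UTF_8)

puts format("%X", meta_a)               # => E1
puts s.valid_encoding?                  # => false
```

0xE1 announces a three-byte sequence, but 0x61 is not a continuation byte, so valid_encoding? reports the string as broken.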

Another UTF-8 destroying situation occurs with "\C-€"...

t = "€\C-€"
puts t.encoding                      # => <Encoding:UTF-8>
puts t.length                        # => 4
puts t.bytesize                      # => 6
t.each_byte{|b|print("%X " %b)}      # => E2 82 AC 82 82 AC
puts
t.each_char{|c|print("%X " % c.ord)} # => 20AC 82 82 AC

..., because it puts the byte 0x82 in the leading position of a UTF-8
sequence, which is invalid (0x82 can only be a continuation byte). The
resulting String is an ill-formed UTF-8 sequence.
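The same effect can be rebuilt byte by byte. This is a rough sketch, assuming that \C- masks only the first byte of the following character with 0x9F, which is consistent with the byte dump above (0xE2 becomes 0x82); U+20AC is the character whose bytes are E2 82 AC.

```ruby
euro = "\u20AC"                          # bytes E2 82 AC
first, *rest = euro.bytes

# Assumption: the control modifier masks the first byte only.
ctrl_bytes = [first & 0x9F, *rest]       # 0xE2 & 0x9F == 0x82

t = (euro.bytes + ctrl_bytes).pack("C*").force_encoding(Encoding::UTF_8)

puts t.bytes.map { |b| format("%X", b) }.join(" ")  # => E2 82 AC 82 82 AC
puts t.valid_encoding?                              # => false
```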

My proposal is either to disallow these special modifiers for encodings
other than ASCII (which might be complicated), or to enforce ASCII
encoding for Strings that contain these special modifiers.
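Until something like that exists, a script can guard itself at run time. The following is only a sketch of a workaround, not the proposed parser change: binary_unless_valid is a hypothetical helper, and it retags to Encoding::BINARY (ASCII-8BIT) rather than to plain ASCII, because a byte such as 0xE1 is not valid US-ASCII either.

```ruby
# Hypothetical helper (not part of Ruby): return the string unchanged if
# it is well-formed, otherwise return a copy tagged as raw bytes.
def binary_unless_valid(str)
  return str if str.valid_encoding?
  str.dup.force_encoding(Encoding::BINARY)
end

# Same bytes as the first example in this report.
broken = [0x61, 0xE1, 0x61].pack("C*").force_encoding(Encoding::UTF_8)
fixed  = binary_unless_valid(broken)

puts fixed.valid_encoding?   # => true
puts fixed.length            # => 3
```

With the binary tag, each_char and length work on single bytes, so iterating no longer trips over the ill-formed sequence.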

Wolfgang Nádasi-Donner