Issue #8650 has been updated by nagachika (Tomoyuki Chikanaga).

Backport changed from 1.9.3: UNKNOWN, 2.0.0: UNKNOWN to 1.9.3: REQUIRED, 2.0.0: REQUIRED


----------------------------------------
Bug #8650: Unexpected result of Regexp#to_s with utf-16 and utf-32 string.
https://bugs.ruby-lang.org/issues/8650#change-40662

Author: phasis68 (Heesob Park)
Status: Closed
Priority: Normal
Assignee: 
Category: 
Target version: 
ruby -v: ruby 2.1.0dev (2013-07-17 trunk 42011) [i386-mingw32]
Backport: 1.9.3: REQUIRED, 2.0.0: REQUIRED


I found that Regexp#to_s returns an incorrect result when the regexp is built from a UTF-16 or UTF-32 encoded string.

C:\Users\phasis>irb
irb(main):001:0> Regexp.new('abcd'.encode('UTF-16LE'))
=> /a b c d /
irb(main):002:0> Regexp.new('abcd'.encode('UTF-16LE')).to_s
=> "\u3F28\u6D2D\u7869\u613A\u6200\u6300\u6400\u2900"
irb(main):003:0> Regexp.new('abcd'.encode('UTF-16BE'))
=> / a b c d/
irb(main):004:0> Regexp.new('abcd'.encode('UTF-16BE')).to_s
=> "\u283F\u2D6D\u6978\u3A00\u6100\u6200\u6300\u6429"
irb(main):005:0> Regexp.new('abcd'.encode('UTF-32LE'))
=> /a   b   c   d   /
irb(main):006:0> Regexp.new('abcd'.encode('UTF-32LE')).to_s
=> "\u{6D2D3F28}\u{613A7869}\u{62000000}\u{63000000}\u{64000000}\u{29000000}"
irb(main):007:0> Regexp.new('abcd'.encode('UTF-32BE'))
=> /   a   b   c   d/
irb(main):008:0> Regexp.new('abcd'.encode('UTF-32BE')).to_s
=> "\u{283F2D6D}\u{69783A00}\u6100\u6200\u6300\u6429"
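
The garbled values in the transcript are consistent with the ASCII bytes of "(?-mix:" and ")" being joined to the UTF-16/UTF-32 pattern bytes and the result then being labelled with the pattern's encoding. The report does not state the cause, so the following is only a sketch of that assumption. `naive_to_s` is a hypothetical helper written for illustration; it is not Ruby's actual implementation.

```ruby
# Hypothetical helper (an assumption, not Ruby's implementation):
# join the ASCII wrapper and the pattern at the byte level, then
# relabel the combined bytes with the pattern's own encoding.
def naive_to_s(source)
  raw = "(?-mix:".b + source.b + ")".b   # plain byte concatenation
  raw.force_encoding(source.encoding)    # reinterpret the bytes, no transcoding
end

garbled = naive_to_s('abcd'.encode('UTF-16LE'))
p garbled.codepoints.map { |c| format('%04X', c) }
# prints ["3F28", "6D2D", "7869", "613A", "6200", "6300", "6400", "2900"]
```

These code points match the UTF-16LE output shown for irb(main):002 above. The same helper applied to a UTF-16BE string yields the code points shown for irb(main):004.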

The result is the same on Ruby 1.9.3 and Ruby 2.0.0.
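
If only the textual form of the regexp is needed, one possible workaround is to transcode the pattern to UTF-8 before building the Regexp. This is a minimal sketch, not something proposed in the report. `utf8_regexp` is a hypothetical helper name, and the resulting regexp is meant for ASCII-compatible input, not for matching UTF-16/UTF-32 strings directly.

```ruby
# Hypothetical helper (illustration only): build the Regexp from a
# UTF-8 copy of the pattern so to_s works on an ASCII-compatible source.
def utf8_regexp(source)
  Regexp.new(source.encode('UTF-8'))
end

re = utf8_regexp('abcd'.encode('UTF-16LE'))
puts re.to_s   # prints (?-mix:abcd)
```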


-- 
http://bugs.ruby-lang.org/