Issue #16145 has been reported by zenspider (Ryan Davis).

----------------------------------------
Bug #16145: regexp match error if mixing /i, character classes, and utf8
https://bugs.ruby-lang.org/issues/16145

* Author: zenspider (Ryan Davis)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: 
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN
----------------------------------------
(reported on behalf of mage / mage.gold -- there appears to be an error in registration or login):

See: ruby-talk @ X-Mail-Count: 440336

2.6.3 :049 > 'SHOP' =~ /[xo]/i
 => 2
2.6.3 :050 > 'CAFÉ' =~ /[é]/i
 => 3
2.6.3 :051 > 'CAFÉ' =~ /[xé]/i
 => nil
2.6.3 :052 > 'CAFÉ' =~ /[xÉ]/i
 => 3

Expected result:

2.6.3 :051 > 'CAFÉ' =~ /[xé]/i
 => 3

I tested it on several online regex testers.

It does not match on https://regex101.com/

It matches on:

https://regexr.com/
https://www.regextester.com/
https://www.freeformatter.com/regex-tester.html

(Ignore case turned on).

The reason I suppose it's more like a bug than a feature is the fact that /[é]/i matches 'CAFÉ'. If the //i didn't work for UTF-8 characters then the /[é]/i wouldn't match it either. For example, [é] does not match 'CAFÉ' on https://regex101.com/
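
The argument above can be checked with a small standalone script. This is only a sketch: it assumes the non-ASCII characters involved are U+00C9 (É) and U+00E9 (é), written as \u escapes so the source file's encoding does not matter. The hash name `checks` and its labels are made up for illustration. On the affected 2.6.3 builds described in this report, the third entry is the one expected to come back as nil; other versions may differ.

```ruby
# Sketch of the four cases from the IRB session, under the assumption that
# the characters are U+00C9 (uppercase E acute) and U+00E9 (lowercase e acute).
subject = "CAF\u00C9"

checks = {
  "ascii class"          => "SHOP" =~ /[xo]/i,          # plain ASCII, case-insensitive
  "lone lowercase"       => subject =~ /[\u00E9]/i,     # single non-ASCII char in the class
  "mixed with lowercase" => subject =~ /[x\u00E9]/i,    # the case reported as returning nil
  "mixed with uppercase" => subject =~ /[x\u00C9]/i,    # exact-case member, no folding needed
}

# Print each match index (or nil) so the odd one out is easy to spot.
checks.each do |label, index|
  puts format("%-22s %s", label, index.inspect)
end
```

If case folding were simply unsupported for non-ASCII characters, the second entry would be nil as well; the report's point is that only the mixed class fails.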

I could not find a page or a system that behaves the same way as Ruby does. For example, it matches in PostgreSQL 10 (under FreeBSD 12) too:

# select 'CAFÉ' ~ '[xé]';
 ?column?
----------
 f
(1 row)

# select 'CAFÉ' ~* '[xé]';
 ?column?
----------
 t
(1 row)

Tested it in IRB on macOS and FreeBSD.

 $ uname -a && ruby -v && locale
Darwin xxx 18.7.0 Darwin Kernel Version 18.7.0: Thu Jun 20 18:42:21 PDT 2019; root:xnu-4903.270.47~4/RELEASE_X86_64 x86_64
ruby 2.6.3p62 (2019-04-16 revision 67580) [x86_64-darwin18]
LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL="en_US.UTF-8"

$ uname -a && ruby -v && locale
FreeBSD xxx 12.0-RELEASE-p9 FreeBSD 12.0-RELEASE-p9 GENERIC  amd64
ruby 2.6.3p62 (2019-04-16 revision 67580) [x86_64-freebsd12.0]
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_ALL=en_US.UTF-8

I installed Ruby with RVM.
