Issue #8210 has been updated by naruse (Yui NARUSE).


k_takata (Ken Takata) wrote:
> This problem was caused by optimization of \z.
> I wrote two patches to fix this problem.
> 
> Maybe fix-8210-1.diff is more efficient than fix-8210-2.diff,
> but the former one tries to do backward search when 'start==range'
> after 'start' is adjusted. This behavior is a little bit confusing.

I think fix-8210-1.diff is the more suitable one, because it seems to preserve the original intention of the code better than fix-8210-2.diff does.
----------------------------------------
Bug #8210: Multibyte character interfering with end-line character within a regex
https://bugs.ruby-lang.org/issues/8210#change-38450

Author: sawa (Tsuyoshi Sawada)
Status: Assigned
Priority: Normal
Assignee: naruse (Yui NARUSE)
Category: M17N
Target version: current: 2.1.0
ruby -v: 2.0


=begin
With this regex:

    regex1 = /\z/

the following strings match as expected:

    "hello" =~ regex1 # => 5
    "????????лу?бу??" =~ regex1 # => 5

but with these regexes:

    regex2 = /#$/?\z/
    regex3 = /\n?\z/

the results differ:

    "hello" =~ regex2 # => 5
    "hello" =~ regex3 # => 5
    "????????лу?бу??" =~ regex2 # => nil
    "????????лу?бу??" =~ regex3 # => nil

The string encoding is UTF-8, and the OS is Linux (i.e., `$/` is `"\n"`). I expect them to behave the same, and believe this is a bug.
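
The expectation above can be checked with a small script. This is only a sketch: `end_match_positions` is a hypothetical helper name, and the multibyte string is a placeholder chosen for illustration (5 characters, 6 bytes in UTF-8), not the string from the original report. The `$/` case is built with `Regexp.escape` rather than the `/#$/?\z/` literal, assuming `$/` still has its default value `"\n"`.

```ruby
# encoding: utf-8

# Hypothetical helper (not part of the report): collects the match position
# of each end-of-string regex against one string for side-by-side comparison.
def end_match_positions(str)
  {
    plain:   str =~ /\z/,                              # like regex1
    newline: str =~ /\n?\z/,                           # like regex3
    sep:     str =~ /#{Regexp.escape($/)}?\z/,         # like regex2, assuming $/ == "\n"
  }
end

ascii     = "hello"
multibyte = "h\u00E9llo"  # placeholder string: 5 characters, 6 bytes in UTF-8

p end_match_positions(ascii)
p end_match_positions(multibyte)
```

If the three regexes behave the same, every value in each hash should equal the character length of the string; a `nil` for the multibyte string would reproduce the behavior described here.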
=end


-- 
http://bugs.ruby-lang.org/