Issue #8110 has been updated by sam.saffron (Sam Saffron).


@naruse 

There is a performance implication here that really needs addressing, and fixing it would help elsewhere too:

In re.c, there is a whole bunch of work that can be avoided when NO_BACKREF is passed in for the match:

In particular:

        match = match_alloc(rb_cMatch);
        onig_region_copy(RMATCH_REGS(match), regs);
        onig_region_free(regs, 0);
    }
    else {
        if (rb_safe_level() >= 3)
            OBJ_TAINT(match);
        else
            FL_UNSET(match, FL_TAINT);
    }

    RMATCH(match)->str = rb_str_new4(str);
    RMATCH(match)->regexp = re;
    RMATCH(match)->rmatch->char_offset_updated = 0;
    rb_backref_set(match);

    OBJ_INFECT(match, re);
    OBJ_INFECT(match, str);

This in turn should improve the performance of regex matching with the /B option quite a lot. 

I have been looking at this recently because of some performance issues I noticed in Active Support's String#blank?

The C implementation of:

  def blank?
    self !~ /[^[:space:]]/
  end


is the somewhat crazy: 

https://github.com/SamSaffron/fast_blank/blob/master/ext/fast_blank/fast_blank.c#L16-L55

That implementation is 5 to 8 times faster than the regex version.
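
For anyone who wants to reproduce the regex side of that comparison, here is a rough timing sketch. The names regex_blank? and time_regex_blank, the sample strings, and the iteration count are all made up for illustration; this is not the benchmark I actually ran, and the numbers will vary by machine and Ruby version.

```ruby
require "benchmark"

# Same pattern as the Active Support definition quoted above.
NON_SPACE = /[^[:space:]]/

# Hypothetical stand-in for String#blank?, so String is not monkey patched.
def regex_blank?(str)
  str !~ NON_SPACE
end

# Hypothetical harness: wall-clock seconds to check every string
# `iterations` times.
def time_regex_blank(strings, iterations)
  Benchmark.realtime do
    iterations.times { strings.each { |s| regex_blank?(s) } }
  end
end

samples = ["", "   ", "\t\n", "hello", "   x   "]
elapsed = time_regex_blank(samples, 100_000)
puts format("regex blank?: %.3fs for %d calls", elapsed, samples.size * 100_000)
```

The same harness could be pointed at the C extension to get the other half of the ratio.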

I vote for:

* new option for Regexp like Regexp.new("foo", Regexp::NO_BACKREF) AND /foo/B

You can then feature-detect whether it's available by checking for Regexp::NO_BACKREF.
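
A minimal sketch of what that detection could look like. Regexp::NO_BACKREF is only the constant proposed in this thread and is assumed not to exist in any released Ruby, so the code falls back to no flags; NON_SPACE_PATTERN and blank_string? are illustrative names.

```ruby
# Regexp::NO_BACKREF is the *proposed* constant from this thread (an
# assumption, not an existing API). Fall back to plain flags when absent.
flags = Regexp.const_defined?(:NO_BACKREF) ? Regexp::NO_BACKREF : 0

# Hypothetical names, for illustration only.
NON_SPACE_PATTERN = Regexp.new("[^[:space:]]", flags)

def blank_string?(str)
  str !~ NON_SPACE_PATTERN
end
```

Library code written this way would pick up the faster path automatically once the option exists, and behave exactly as today otherwise.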

I do wonder how much faster this will be in my micro benchmark compared with the native C implementation. When you are done, can you ping me so I can bench it? (at sam.saffron / gmail.com)

----------------------------------------
Feature #8110: Regex methods not changing global variables
https://bugs.ruby-lang.org/issues/8110#change-38128

Author: prijutme4ty (Ilya Vorontsov)
Status: Assigned
Priority: Normal
Assignee: matz (Yukihiro Matsumoto)
Category: core
Target version: next minor


It would be useful to have methods that allow pattern matching without setting global variables. It can be very hard to understand where the problem is when, for example, you insert a line like `puts pat === my_str` and your program fails somewhere far away from the inserted line. This can happen because the global variables from a previous pattern match get replaced. I ran into this when I placed a pattern match inside a case statement and shadowed the global variables that had initially been filled by the match in the when clause.
For now, one can extract the pattern match into another method, thereby giving those variables a method scope. But sometimes that looks like overkill. Maybe a simple method like #match_globalsafe could prevent that kind of error. At the very least, when a programmer sees such a method in the list of methods, they are warned that the usual match can cause such problems.
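
The problem and the current workaround can be sketched as follows. All three method names are hypothetical and exist only for this illustration; quiet_match? is not the proposed #match_globalsafe, just the "extract into another method" approach described above.

```ruby
# Hypothetical helper: runs the match inside its own method frame, so the
# caller's $~ (and therefore $1, $2, ...) is left alone.
def quiet_match?(pattern, str)
  pattern === str
end

# An innocent-looking debugging match replaces the captures of the
# earlier match.
def clobbered_capture
  "foo bar" =~ /(\w+) (\w+)/
  /(\d+)/ === "42"             # overwrites $~ in this frame
  $1
end

# The same check routed through the helper keeps the earlier captures.
def preserved_capture
  "foo bar" =~ /(\w+) (\w+)/
  quiet_match?(/(\d+)/, "42")  # $~ is set only inside quiet_match?
  $1
end

puts clobbered_capture   # prints "42"
puts preserved_capture   # prints "foo"
```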


-- 
http://bugs.ruby-lang.org/