>>>>> "W" == Wesley J Landaker <wjl / mindless.com> writes:

W> Of course, the ONLY difference in the regexes is one has (^|.) and the 
W> other has (.|^). 

 A simpler example:

pigeon% ruby -e 'puts "OK" if /(.|a)bd/ =~ "cxbd"'
pigeon% ruby -e 'puts "OK" if /(a|.)bd/ =~ "cxbd"'
OK
pigeon% 

 The problem is in an optimization in the regexp compiler, more precisely here:


      if (mcnt == 4 && *laststart == anychar) {
	switch ((enum regexpcode)laststart[1]) {
	case jump_n:
	case finalize_jump:
	case maybe_finalize_jump:
	case jump:
	case jump_past_alt:
	case dummy_failure_jump:
	  bufp->options |= RE_OPTIMIZE_ANCHOR;
	  break;
	default:
	  break;
	}
      }

 This means that it has found a regexp beginning with a branch | where the
 first possibility is . (anychar). In this case it sets the flag
 RE_OPTIMIZE_ANCHOR (which means that the regexp must match only at the
 beginning of the line).

 But it can't do that, as the 2 examples given show.
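
 To see the effect from Ruby, here is a small sketch. It assumes that the
 flag behaves like an implicit ^ at the start of the regexp (my assumption
 for illustration, not something taken from regex.c). The helper
 match_offset is hypothetical and only wraps =~.

```ruby
# Hypothetical helper (not part of Ruby): offset of the first match, or nil.
def match_offset(re, str)
  re =~ str
end

STR = "cxbd"

# Correct, unanchored search: with either order of the alternation the
# match is "xbd", found at offset 1.
UNANCHORED = [/(.|a)bd/, /(a|.)bd/].map { |re| match_offset(re, STR) }

# Assumption: RE_OPTIMIZE_ANCHOR acts like a leading ^, so only the
# beginning of the line is tried and the match at offset 1 is never seen.
ANCHORED = match_offset(/^(.|a)bd/, STR)

p UNANCHORED  # expected [1, 1] on an interpreter without the bug
p ANCHORED    # nil
```

 With the explicit ^ the search fails, which is the same result the first
 example gives when the flag is set by mistake.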


Guy Decoux