On Sat, Feb 6, 2010 at 3:23 PM, Brian Candler <b.candler / pobox.com> wrote:
> Michal Suchanek wrote:

>> "ajabcabck".scan /^a*j(?:b*(a+)b+c*)+k$/
>> => [["a"]]
>>
>>
>> clearly the a+ group must match twice to match the string from ^ to $
>> but only single match is returned.
>
> But the regular expression you're passing is anchored, so the entire
> regexp is only matched once, and it only contains one capturing group.

Well, I think I understand what the OP is saying. Let's break the
match down:

"ajabcabck".match /^a*j(?:b*(a+)b+c*)+k$/

/^a*j/ matches "aj", leaving "abcabck"
/(?:b*(a+)b+c*)+/ matches "abcabc", leaving "k"
/k$/ matches "k", and we're done
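
To check that breakdown, here is a small sketch that wraps each of the
three pieces in its own capture group. The extra parentheses and the
variable name `parts` are my additions for illustration; they are not in
the original pattern.

```ruby
# Same pattern, with each piece wrapped in an extra group so we can see
# what it consumed. Group 3 is the original (a+) group.
parts = "ajabcabck".match(/^(a*j)((?:b*(a+)b+c*)+)(k)$/)

p parts.captures
# => ["aj", "abcabc", "a", "k"]
```

The non-capture group ran twice (once per "abc"), yet group 3 holds a
single "a".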

Now there's a capture group inside that second part, a non-capture
group which can repeat (and in this case does).

Since it repeats, one might think that there would be one capture for
each repetition, but there isn't. Each repetition overwrites the
previous capture, so only the last one is kept.

Here's a simpler example:

/^(a)+$/.match("aa").to_a
=> ["aa", "a"]
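
In both examples above every repetition captures the same text, so you
can't tell which one was kept. Here is a sketch using an input of my own
choosing ("ajaabcabck", where the repetitions capture "aa" and then "a"),
plus one possible way to collect every repetition: validate the whole
string with the anchored pattern, then scan the repeated part with the
inner pattern alone. The variable names are just for illustration.

```ruby
str = "ajaabcabck"

# A repeated capture group keeps only what its final repetition matched.
last = str.match(/^a*j(?:b*(a+)b+c*)+k$/)[1]
p last
# => "a"

# To get every repetition: capture the whole repeated section once,
# then scan that section with the inner pattern on its own.
m = str.match(/^a*j((?:b*a+b+c*)+)k$/)
all = m ? m[1].scan(/b*(a+)b+c*/).flatten : []
p all
# => ["aa", "a"]
```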


Also see http://www.regular-expressions.info/captureall.html
-- 
Rick DeNatale

Blog: http://talklikeaduck.denhaven2.com/
Twitter: http://twitter.com/RickDeNatale
WWR: http://www.workingwithrails.com/person/9021-rick-denatale
LinkedIn: http://www.linkedin.com/in/rickdenatale