On 12/15/05, William James <w_a_x_man / yahoo.com> wrote:
> Garance A Drosehn wrote:
>
> > ... Then, after I know the line is valid, I want
> > the array of source-words, and the array of destination-words
> > which were matched. I want to do that by picking out information
> > in Matchdata, not by doing a new scan. [...]
> > I could put another set of parenthesis around the two repeating
> > groups:
> >
> > /^(copy|duplicate) \s+ ((\w+\s+)+) (before|after) \s+ ((\w+\s*)+) $/x
> >
> > But that doesn't really give me two separate arrays of the
> > individual values that made up each group. It just matches
> > each group as a whole.
> >
> > Given two data lines of:
> >     copy apple pear plum peach after bill bob
> >     duplicate tomato before joe alice alfred tommy jane
> >
> > in the first case I want a way to set two arrays:
> >     srcfood = ["apple ", "pear ", "plum ", "peach "]
> >     destword = ["bill ", "bob"]
> > from the first line, and
> >     srcfood = ["tomato "]
> >     destword = ["joe ", "alice", "alfred ", "tommy ", "jane"]
> > from the second line.
> >
> > I'll agree this is a weird example, but I think it shows the issue.
> > If I apply the above pattern to the first line, I'll see a Matchdata
> > result where:
> >
> > $~.captures ==
> > ["copy", "apple pear plum peach ", "peach ", "after", "bill bob", "bob"]
>
> DATA.each {|line| line.chomp!
>   md =
>     /^(?:copy|duplicate) \s+
>      ((?:\w+\s+)+)
>      (?:after|before) \s+
>      ((?:\w+\s*)+) $
>     /x.match( line )
>   p md.captures
>   src_food = md.captures.first.split
>   dest_word = md.captures.last.split
>   p src_food, dest_word
> }

This does happen to solve my specific example, but...

...that split only works because "you know" what the repeating
pattern is. While it does not explicitly repeat the original
regex, that split does the job only because the repeating pattern
does not include blanks. Don't look at the *specific* pattern that
I am repeating, but try to imagine *any* repeating pattern at that
point in the example.
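
To make the point about blanks concrete, here is a small sketch using a
made-up variation of the example, where each repeated item is a
double-quoted phrase. The input format and the method name source_items
are assumptions for illustration only. A plain split cuts the phrases
apart, and the only way to recover the items is to scan again with the
same sub-pattern, which is exactly the duplication I would like to avoid.

```ruby
# Hypothetical variation of the example: each repeated item is a
# double-quoted phrase, so a single item may contain blanks.
# Returns the source items two ways: by whitespace split, and by
# re-scanning the captured text with the item pattern.
def source_items(line)
  md = /^(?:copy|duplicate) \s+
        ((?:"[^"]*"\s+)+)
        (?:after|before) \s+
        ((?:"[^"]*"\s*)+) $
       /x.match(line)
  return nil unless md

  src = md.captures.first
  { split: src.split, scan: src.scan(/"[^"]*"\s+/) }
end

result = source_items('copy "red apple" "green pear" after "bill smith"')
p result[:split]  # four fragments: the quoted phrases are cut at the blanks
p result[:scan]   # two items, each a whole quoted phrase
```
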
Right now, can we come up with a solution where I can replace that
pattern with *anything* I want to repeat, and the solution still
works?

Right now what Ruby does is save only the *last* copy of however
many things it matched. It does seem to me that it should save *all*
copies of what it matched -- somewhere.

For instance, let's say that MatchData included another method called
"repeated", and that method returns an array. This array has the
same number of elements as captures does. If the pattern-segment for
captures[0] is NOT a repeating pattern, then repeated[0] returns nil.
If captures[1] is tied to a pattern-segment that does (possibly)
repeat, then repeated[1] returns an array of strings, one element for
each time that pattern-segment was found. Eg:

    /^(copy|duplicate) \s+ (\w+\s+)+ (before|after) \s+ (\w+\s*)+ $/x

used to match against the string:

    "copy apple pear plum peach after bill bob"

    $~.captures[0] == "copy"
    $~.repeated[0] == nil
    $~.captures[1] == "peach "
    $~.repeated[1] == ["apple ", "pear ", "plum ", "peach "]
    $~.captures[2] == "after"
    $~.repeated[2] == nil
    $~.captures[3] == "bob"
    $~.repeated[3] == ["bill ", "bob"]

Note that I wouldn't even need to add the extra '()' around
'(\w+\s+)+' if Ruby provided something like this.

Of course, the next question is why not just make captures[1] be the
array of "things" which were repeatedly matched, instead of only
holding the last instance of that repeated pattern. That would work
fine, IMO, although I guess it might break the scripts of some people.

-- 
Garance Alistair Drosehn     =  drosihn / gmail.com
Senior Systems Programmer        or   gad / FreeBSD.org
Rensselaer Polytechnic Institute;   Troy, NY;  USA
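
P.S. A rough sketch of the results I have in mind for "repeated",
emulated in plain Ruby. The helper name match_with_repeated is made up;
nothing like it exists in MatchData. It still re-scans the captured
text, so it only imitates the output and is not the built-in support I
am asking for. It also assumes the item patterns contain no capturing
groups, and that re-scanning a span splits it the same way the original
match did, which will not hold for every pattern.

```ruby
# Hypothetical helper: emulates the proposed MatchData#repeated by
# re-scanning each repeated span with its own item pattern.
# src_item and dest_item are assumed to have no capturing groups.
def match_with_repeated(line, src_item, dest_item)
  re = /^(copy|duplicate)\s+((?:#{src_item})+)(before|after)\s+((?:#{dest_item})+)$/
  md = re.match(line)
  return nil unless md

  verb, src_all, where, dest_all = md.captures
  # nil for the non-repeating groups, an array of items for the others.
  repeated = [nil, src_all.scan(src_item), nil, dest_all.scan(dest_item)]
  # As Ruby does today, a repeated group's capture is its last item.
  captures = [verb, repeated[1].last, where, repeated[3].last]
  { captures: captures, repeated: repeated }
end

r = match_with_repeated("copy apple pear plum peach after bill bob",
                        /\w+\s+/, /\w+\s*/)
p r[:captures]
p r[:repeated]
```

Because each item pattern is passed in once and used both to build the
full regex and to split the span, swapping in a different repeating
pattern only requires changing the arguments.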