Hi --

On Fri, 15 Aug 2003, Dan Doel wrote:

> What about:
>
> re = /((?@<paths>(\/\w+)+\/)(?@<filenames>\w+),?)+/
>
> m = re.match("/usr/local/bin/ruby,/bin/bash,/sbin/reboot")
>
> m['paths']     # => ["/usr/local/bin/", "/bin/", "/sbin/"]
> m['filenames'] # => ["ruby", "bash", "reboot"]
>
> My regexps may be off (didn't test :)), but you get the idea.
> Wouldn't that require multiple regexps with scan?

No, one will do it, though the results come back in a somewhat
different form:

  str = "/usr/local/bin/ruby,/bin/bash,/sbin/reboot"
  re = /((?:\/\w+)+)\/(\w+),?/
  str.scan(re)
  # => [["/usr/local/bin", "ruby"], ["/bin", "bash"], ["/sbin", "reboot"]]

The proposed regex extension would be more concise if you wanted to
flip these into two arrays, though probably less concise for other
things, like making a hash from them.  I guess it's a matter of
weighing that potential conciseness against what feels to me like the
pretty major step of letting regexes include arbitrary strings (the
capture names) which play no part in determining the match.  My own
inclination is to keep that kind of information off-loaded and out
of the regex itself.
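
For what it's worth, here is a rough sketch of both rearrangements,
starting from the scan result above.  The variable names (pairs,
paths, filenames, path_for) are just mine for illustration, and note
that the paths come back without the trailing slash that the proposed
version would include:

```ruby
str = "/usr/local/bin/ruby,/bin/bash,/sbin/reboot"
re  = /((?:\/\w+)+)\/(\w+),?/

# One [path, filename] pair per match.
pairs = str.scan(re)

# Flip the pairs into two parallel arrays.
paths, filenames = pairs.transpose

# Or build a hash from filename to path instead.
path_for = {}
pairs.each { |path, name| path_for[name] = path }
```

The flip costs one extra call (transpose), and the hash costs one
short loop, so keeping the names outside the regex doesn't seem to
cost much in either case.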

I'm also wondering what effect the proposed extension would have on
MatchData#to_a, if the MatchData object had non-numerical indices.


David

-- 
David Alan Black
home: dblack / superlink.net
work: blackdav / shu.edu
Web:  http://pirate.shu.edu/~blackdav