On Wed, 18 Aug 2004 19:31:01 +0900, Robert Klemme <bob.news / gmx.net> wrote:
> "Austin Ziegler" <halostatue / gmail.com> wrote in message
> news:9e7db91104081713254f2eb39e / mail.gmail.com...
>> str = '<span id="1"> <span> ...</span> </span> '
>> re = /(<(\/?)span> )/i
>>
>> str.scan(re)
>> # => [["<span> ", ""], ["</span> ", "/"], ["</span> ", "/"]]
>>
>> matches = []
>> str.scan(re) do
>>   matches << Regexp.last_match
>> end
>>
>> matches.each do |match|
>>   match.captures.each_with_index do |capture, ii|
>>     soff, eoff = match.offset(ii + 1)
>>     puts %Q("#{capture}" #{soff} .. #{eoff})
>>   end
>> end
>
> While that works, isn't it ridiculous that one has to resort to a
> class method ("Regexp.last_match")? I mean, there should rather be
> something like
>
> /o/.each( "foo" ) do |md|
>   # md is MatchData
> end

There's a simple solution, and I'll probably open an RCR about this if
others agree with it.

String#scan, #sub, and #gsub should yield MatchData objects, not
Strings. There are probably others, but those are the ones that come
to mind. This *will* break some code, unfortunately, but that can be
mitigated by adding #to_str to MatchData.

IMO, this will make #gsub much easier to deal with, as you won't have
to resort to either Regexp.last_match or the $[0-9] variables to be
able to work with captures. My Regexp.last_match call only presumes
that Regexp.last_match is actually threadsafe, whereas we know that
the ugly Perlish $ variables are threadsafe.

I think this is an acceptable level of incompatibility because of the
use of #to_str and the amount of flexibility that would be gained. As
far as I know, it wouldn't require *that* big a change, because for
Regexp.last_match to work, there must still be a MatchData object
*somewhere*.

What do you think?

-austin
-- 
Austin Ziegler * halostatue / gmail.com
               * Alternate: austin / halostatue.ca
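
P.S. To make the idea a bit more concrete, here is a rough sketch of how MatchData-yielding iteration could be emulated without any change to the core. The name `each_match` is made up for illustration and is not an existing Ruby method; it only wraps the String#scan / Regexp.last_match idiom from the quoted code, so it carries the same assumption about Regexp.last_match being safe to call inside the block.

```ruby
# Hypothetical helper (NOT part of Ruby): yields one MatchData object per
# match of +re+ in +str+ and returns all of them in an Array.
# It relies on String#scan setting Regexp.last_match for each iteration.
def each_match(str, re)
  matches = []
  str.scan(re) { matches << Regexp.last_match }
  matches.each { |md| yield md } if block_given?
  matches
end

str = '<span id="1"> <span> ...</span> </span> '
re  = /(<(\/?)span> )/i

each_match(str, re) do |md|
  # md is a MatchData, so captures and offsets are available directly,
  # without touching Regexp.last_match or the $ variables at the call site.
  md.captures.each_with_index do |capture, ii|
    soff, eoff = md.offset(ii + 1)
    puts %Q("#{capture}" #{soff} .. #{eoff})
  end
end
```

This is only a stand-in for the proposal: the real change would have #scan, #sub, and #gsub yield the MatchData themselves, with #to_str covering code that still expects a String.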