On Oct 3, 2005, at 7:01 AM, Gavin Kistner wrote:
> str = 'foo,bar ,, baz,qux,,,jorb,jing,,,,blat'
> out = []
> str.scan( /(.+?[^,],{2}*)(?:,(?!,)|$)/ ){ |a,b|
>   out << a.gsub( ',,', ',' )
> }
> p out
> #=> ["foo", "bar , baz", "qux,", "jorb", "jing,,blat"]

Whenever I find myself about to do something like the above, I say to myself: "Hey, buddy, pre-allocating an array and shoving stuff onto it in a block is neat as an exercise of the closure, but you should be using something like #map."

Unfortunately, it would appear that #scan doesn't automagically map the returned value from each iteration to an array. Man, wouldn't that be nice?

Following is my hackish attempt to make a String#scan_and_map function that does the above. A few questions for the gurus:

a) Is there a better way to deal with bol? with StringScanner? (Boy, it'd be nice if there was a Regexp#uses_bol_at_start_of_match? method.)

b) Is there a clean way to tell the 'arity' of a regexp (how many captures it has, at max)? (Boy, it'd be nice if there was a Regexp#arity method.)

c) Without knowing the arity, is there a clean/fast way to gather all the 1..n submatches held in StringScanner? (Boy, it'd be nice if StringScanner gave you access to an array of subcaptures as a single property. And if it set the $1..$9 vars.)

require 'strscan'
class String
  def scan_and_map( regexp )
    # A naive check for beginning of line
    use_bol = regexp.inspect =~ /\/(?:\((?:\?:)?)*\^/

    # A naive check for sub-expression groups
    # Will fail for unescaped ( inside [], for example
    use_groups = regexp.inspect =~ /(\^|[^\\])\\{2}*\(/

    results = []
    ss = StringScanner.new( self )
    while !ss.eos?
      ss.scan_until( regexp ) unless ss.match?( regexp )
      if use_bol and not ss.bol?
        ss.pos += 1
      else
        result = ss.scan( regexp )
        if use_groups
          result = (1..9).to_a.map{ |i| ss[i] }
        end
        results << yield( result )
      end
    end
    results
  end
end

str = 'foo,bar ,, baz,qux,,,jorb,jing,,,,blat'
p str.scan_and_map( /(.+?[^,],{2}*)(?:,(?!,)|$)/ ){ |saved,others|
  saved
}
#=> ["foo", "bar , baz", "qux,", "jorb", "jing,,blat"]
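
For comparison, here is a minimal sketch of how the scan-then-map idea and questions (b) and (c) might be approached without inspecting the regexp source. It is not a drop-in replacement for the code above. The helper names scan_map and group_count are made up for illustration and are not part of Ruby, and the StringScanner#captures call assumes Ruby 2.5 or later.

```ruby
require 'strscan'

# Hypothetical helper: scan, then map each match through the block.
# String#scan called without a block already returns an array, so #map
# can simply follow it. With groups, each element is an array of captures.
def scan_map(str, regexp)
  str.scan(regexp).map { |match| yield(match) }
end

# Hypothetical helper: count capture groups by forcing a match.
# Regexp.union appends an empty alternative that always matches '', and
# the resulting MatchData has one slot per group plus one for the whole
# match. Assumes the groups are all numbered (unnamed).
def group_count(regexp)
  Regexp.union(regexp, Regexp.new('')).match('').size - 1
end

pairs = scan_map('a=1, b=22, c=333', /(\w)=(\d+)/) { |key, value| [key, value.to_i] }
p pairs
#=> [["a", 1], ["b", 22], ["c", 333]]

p group_count(/(\w)=(\d+)/)
#=> 2

# Assumes Ruby 2.5+: StringScanner#captures returns the groups of the
# last match as one array, with no need to know the count in advance.
ss = StringScanner.new('a=1, b=22')
ss.scan(/(\w)=(\d+)/)
p ss.captures
#=> ["a", "1"]
```

Note that scan_map builds the full array of matches before mapping, which is fine for short strings but differs from the one-match-at-a-time loop above.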