On Jan 6, 2006, at 8:33 PM, dblack / wobblini.net wrote:

>>>> example = %Q{some words "some quoted text" some more words}
>> => "some words \"some quoted text\" some more words"
>>>> example.scan(/\s+|\w+|"[^"]*"/).
>> ?> reject { |token| token =~ /^\s+$/ }.
>> ?> map { |token| token.sub(/^"/, "").sub(/"$/, "") }
>> => ["some", "words", "some quoted text", "some", "more", "words"]
>
> I think you could do less work:
>
>   example.scan(/"[^"]+"|\S+/).map { |word| word.delete('"') }
>
> (Or am I overlooking some reason you'd want to capture sequences of
> spaces?)
>
> I changed the \w+ to \S+ (and moved it after the | to avoid having it
> sponge up too much) in case the words included non-\w characters.

You're right, that's better all around.

> I guess with zero-width positive lookbehind/ahead one could do it
> without the map operation.

You can drop the map(), if you're willing to replace it with two other
calls:

>> example = %Q{some words "some quoted text" some more words}
=> "some words \"some quoted text\" some more words"
>> example.scan(/"([^"]+)"|(\S+)/).flatten.compact
=> ["some", "words", "some quoted text", "some", "more", "words"]

James Edward Gray II
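
For anyone reading this in the archive, here is a minimal sketch of why the last version needs flatten and compact. When the pattern contains capture groups, scan returns one array per match with a slot for each group, and the group that did not take part in the match is nil. The method name split_quoted is made up for illustration and does not appear in the thread.

```ruby
# Hypothetical helper wrapping the capture-group approach from the message.
def split_quoted(text)
  # Each match is [quoted_text, bare_word]; exactly one slot is non-nil.
  pairs = text.scan(/"([^"]+)"|(\S+)/)
  # flatten merges the pairs into one list, compact drops the nil slots.
  pairs.flatten.compact
end

example = %Q{some words "some quoted text" some more words}

# Raw scan output for a short input, before flatten/compact:
p %Q{a "b c"}.scan(/"([^"]+)"|(\S+)/)
# prints [[nil, "a"], ["b c", nil]]

p split_quoted(example)
# prints ["some", "words", "some quoted text", "some", "more", "words"]
```

One difference from the delete('"') version is worth keeping in mind: here only the quotes that delimit a quoted run are removed, while delete('"') strips every double quote from each token.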