On Jan 6, 2006, at 8:33 PM, dblack / wobblini.net wrote:

>> >> example = %Q{some words "some quoted text" some more words}
>> => "some words \"some quoted text\" some more words"
>> >> example.scan(/\s+|\w+|"[^"]*"/).
>> ?>         reject { |token| token =~ /^\s+$/ }.
>> ?>         map { |token| token.sub(/^"/, "").sub(/"$/, "") }
>> => ["some", "words", "some quoted text", "some", "more", "words"]
>
> I think you could do less work:
>
>   example.scan(/"[^"]+"|\S+/).map { |word| word.delete('"') }
>
> (Or am I overlooking some reason you'd want to capture sequences of
> spaces?)
>
> I changed the \w+ to \S+ (and moved it after the | to avoid having it
> sponge up too much) in case the words included non-\w characters.

You're right, that's better all around.
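
For the archives, here is a quick sketch of why both changes matter. The second sample string and the variable names are my own, made up for illustration:

```ruby
example = %Q{some words "some quoted text" some more words}

# \S+ listed first: it swallows the opening quote along with the first
# word, so the quoted alternative never gets a chance to match.
greedy_first = example.scan(/\S+|"[^"]+"/)
# => ["some", "words", "\"some", "quoted", "text\"", "some", "more", "words"]

# Quoted alternative first: the whole phrase comes back as one token.
quoted_first = example.scan(/"[^"]+"|\S+/).map { |word| word.delete('"') }
# => ["some", "words", "some quoted text", "some", "more", "words"]

# \w+ breaks a word apart at punctuation; \S+ keeps it whole.
tricky = %Q{don't stop "right here"}
with_w = tricky.scan(/"[^"]+"|\w+/)
# => ["don", "t", "stop", "\"right here\""]
with_S = tricky.scan(/"[^"]+"|\S+/)
# => ["don't", "stop", "\"right here\""]
```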

> I guess with zero-width positive lookbehind/ahead one could do it
> without the map operation.

You can drop the map() if you're willing to replace it with two
other calls:

 >> example = %Q{some words "some quoted text" some more words}
=> "some words \"some quoted text\" some more words"
 >> example.scan(/"([^"]+)"|(\S+)/).flatten.compact
=> ["some", "words", "some quoted text", "some", "more", "words"]
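
In case it isn't obvious why the two extra calls are needed: when the pattern contains groups, scan() returns one array of captures per match, and the group that didn't take part in a given match comes back as nil. A small sketch of the intermediate step (the variable names are just for illustration):

```ruby
example = %Q{some words "some quoted text" some more words}

# One [quoted, bare] pair per match; the unused group is nil.
pairs = example.scan(/"([^"]+)"|(\S+)/)
# => [[nil, "some"], [nil, "words"], ["some quoted text", nil],
#     [nil, "some"], [nil, "more"], [nil, "words"]]

# flatten merges the pairs into one list, compact drops the nils.
tokens = pairs.flatten.compact
# => ["some", "words", "some quoted text", "some", "more", "words"]
```

Since the quotes sit outside the first group, they never make it into the captures, which is why no map() is needed to strip them.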

James Edward Gray II