On Sat, 2 Jul 2005, Joe Van Dyk wrote:

> I thought for all of five seconds for a good subject line for this
> question, but failed.  Sorry!
>
> I have a string like:
>
> "some_key: blah, some_other_key: more_blah, yet_other_key: yet_more_blah"
>
> I want to build up a hash like
>
> { :some_key => "blah", :some_other_key => "more_blah", :yet_other_key
> => "yet_more_blah" }
>
> And I don't really want to have to know what the possible keys are in advance.
>
> So, the message format looks like:
> <key>: <value>, <key>: <value>
>
> How can I properly extract it out?
>
> Here's my initial attempt, which works, but seems hackish:
>
>      attributes = message.split(",")
>      attributes.each do |attribute|
>        key, value = attribute.scan(/(\w+): (.+)/)[0]
>        result_hash[key.to_sym] = value.strip
>      end
>
> Also, this will get run potentially thousands of times per second, so
> execution speed is of some concern.

you'll have a hard time getting much faster than strscan:

   harp:~ > cat a.rb
   require 'strscan'

   class HashString < ::Hash
     class SyntaxError < StandardError; end
     def initialize s, dup = false
       load_from s, dup
     end
     def load_from s, dup = false
       @ss = StringScanner::new s, dup
       loop do
         key, value = scan_key, scan_value
         self[key] = value
         break if eos?
       end
       @ss = nil
     end
     # consume "<key>" up to (but not including) the colon, then the colon itself
     def scan_key
       scan(%r/\s*([^:\n]+)\s*(?=:)/) or syntax_error
       key = @ss[1]
       scan(%r/\s*:\s*/) or syntax_error
       key
     end
     # consume "<value>" up to the next comma or newline, then an optional comma
     def scan_value
       scan(%r/\s*([^,\n]+)\s*/) or syntax_error
       value = @ss[1]
       scan(%r/\s*,?\s*/)
       value
     end
     def eos?
       @ss.eos?
     end
     def scan pat
       @ss.scan pat
     end
     def syntax_error
       raise SyntaxError, @ss.peek(16) + '...'
     end
     def to_yaml
       {}.merge(self).to_yaml
     end
   end

   s = <<-txt
     some_key: blah,
   some_other_key: more_blah, yet_other_key:
           yet_more_blah
   txt

   hs = HashString::new s

   require 'yaml'
   y hs


   harp:~ > ruby a.rb
   ---
   some_key: blah
   yet_other_key: yet_more_blah
   some_other_key: more_blah


strscan is written in c and extremely fast.  it doesn't build the intermediate
arrays and substrings that split or scan based solutions do:  it keeps a
pointer into the string and moves through it, so the only new strings are the
keys and values you pull out.  it takes some getting used to but is really good
and part of the standard dist.
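
if you don't need a Hash subclass, the same pointer-walking idea fits in one
method.  this is only a minimal sketch, not a tuned implementation:  the name
parse_pairs is made up for illustration, and it assumes keys match \w+ and
values contain no commas.  unlike the class above it returns symbol keys, as
the original question asked.

```ruby
require 'strscan'

# hypothetical helper (not part of strscan): parses "<key>: <value>, ..."
# into a hash with symbol keys.  assumes keys match \w+ and values
# contain no commas.
def parse_pairs(message)
  ss = StringScanner.new(message)
  result = {}
  until ss.eos?
    ss.skip(/\s*/)                                   # leading whitespace
    key = ss.scan(/\w+/) or
      raise ArgumentError, "expected key at offset #{ss.pos}"
    ss.skip(/\s*:\s*/) or
      raise ArgumentError, "expected ':' at offset #{ss.pos}"
    value = ss.scan(/[^,]*/).strip                   # everything up to the next comma
    result[key.to_sym] = value
    ss.skip(/,\s*/)                                  # optional separator
  end
  result
end

p parse_pairs("some_key: blah, some_other_key: more_blah, yet_other_key: yet_more_blah")
```

each scan or skip only advances the scanner's position (ss.pos), so a bad
message can be reported with the exact offset where parsing stopped.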

cheers.

-a
-- 
===============================================================================
| email :: ara [dot] t [dot] howard [at] noaa [dot] gov
| phone :: 303.497.6469
| My religion is very simple.  My religion is kindness.
| --Tenzin Gyatso
===============================================================================