I'd like to share a problem in case you have ideas for a major improvement.
Below is sample code that shows how my current version works, and that it
performs unacceptably slowly. This is my first "real-world" application in
Ruby, so I have to say I'm quite surprised, and I wouldn't describe my
feelings with phrases like "wow, that's awesome!" or "cooool!" :). I'm
fairly sure there's a lot I can do about it; I just don't know what.

I think the major problem is that I'm creating tons of objects, and most of
the time is spent on garbage collection. The real data doesn't actually get
lost at any point; there are probably just too many intermediate objects
before the final form.
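
One way to check the garbage-collection suspicion would be to count
collector runs around a batch of parsing. This is only a sketch, assuming a
Ruby recent enough to provide GC.count and GC::Profiler; parse_entry and
gc_cost are hypothetical helper names, not part of my program.

```ruby
# Hypothetical helper: parse one "name=value&..." record into a Hash,
# the same way the sample program does.
def parse_entry(str)
  entry = {}
  str.split(/&/).each do |field|
    name, value = field.split(/=/)
    entry[name] = value
  end
  entry
end

# Hypothetical helper: run the block and report how many GC runs happened
# meanwhile and how much time the profiler attributes to them.
# Assumes GC.count and GC::Profiler are available in this interpreter.
def gc_cost
  GC::Profiler.enable
  GC::Profiler.clear
  runs_before = GC.count
  yield
  { runs: GC.count - runs_before, seconds: GC::Profiler.total_time }
ensure
  GC::Profiler.disable
end

cost = gc_cost do
  1000.times { |i| parse_entry("kkkkkk=ABCDE&rrrr=abcd#{i}&eeee=5.90") }
end
printf("GC runs: %d, GC time: %.4f s\n", cost[:runs], cost[:seconds])
```

If the GC time stays small compared with the per-1000 times below, the
slowdown is probably somewhere else than in object creation.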

Anyway, I was expecting (roughly) linear progress while creating and
indexing my 24 000 items. The reality was striking and surprising: the
processing time per 1000 items keeps growing, by more than reallocation of
the store hash alone would explain. And that should be quite a rare event,
if I'm right.

clock time of measurement
    |         processed
    |            |    time spent on processing these
    |            |      |
21:49:20.692      0: 
21:49:21.696   1000:  1.0033
21:49:26.177   2000:  4.4810
21:49:31.275   3000:  5.0985
21:49:38.313   4000:  7.0378
21:49:46.680   5000:  8.3666
21:49:56.060   6000:  9.3803
21:50:08.024   7000:  11.9637
21:50:20.632   8000:  12.6087
21:50:36.306   9000:  15.6736
21:50:52.299  10000:  15.9934
21:51:09.934  11000:  17.6344
21:51:32.223  12000:  22.2889
21:51:51.977  13000:  19.7542
21:52:15.868  14000:  23.8907
21:52:40.387  15000:  24.5198
21:53:05.641  16000:  25.2538
21:53:35.478  17000:  29.8363
21:54:02.646  18000:  27.1685
21:54:34.499  19000:  31.8530
21:55:06.679  20000:  32.1800
21:55:39.463  21000:  32.7836
21:56:18.285  22000:  38.8225
21:56:53.563  23000:  35.2782
21:57:34.268  24000:  40.7044

I thought it would be a snap to read (generate) 24 000 entries, collect the
keys and store them. Well, on a fairly average Intel computer it took over 7
minutes (of CPU time).

Before you start tweaking my program, please keep a few things in mind:
1) In reality there are about this many entries in total, but this
   code is just the initialization. The real program will
   overwrite existing entries many, many times.
2) I receive the data in the format used in the example.
3) Field names and values, and their number, cannot be determined in
   advance. Which fields make up the key is not known in advance either.
4) The store will be searched often and the data inside will be heavily
   operated on. This rules out a 'store' hash in the following format:
  { "FieldValue1/FieldValue2" => "original data string", ... }
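
To make point 4 concrete, here is a small sketch of the two store layouts
side by side. The sample record is shortened and made up for illustration,
and build_entry and key_for are hypothetical helper names.

```ruby
# Hypothetical helper: turn "name=value&..." into a Hash of fields.
def build_entry(raw)
  entry = {}
  raw.split("&").each do |field|
    name, value = field.split("=", 2)
    entry[name] = value
  end
  entry
end

# Hypothetical helper: collect the values of the key fields, in order.
def key_for(entry, key_fields)
  key_fields.map { |kf| entry[kf] }
end

raw = "kkkkkk=ABCDE&rrrr=abcd1234&eeee=5.90"  # made-up, shortened record
key_fields = %w(kkkkkk rrrr)

# Ruled out by point 4: the value stays an opaque string, so every
# search or update would have to parse it again.
flat_store = { "ABCDE/abcd1234" => raw }

# What the program builds instead: an Array of key values pointing to a
# Hash of fields, so single fields can be read and changed directly.
entry = build_entry(raw)
store = { key_for(entry, key_fields) => entry }

puts store[["ABCDE", "abcd1234"]]["eeee"]   # prints 5.90
```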

OK, I guess I've said enough, or even too much. One more thing, though: the
code. Thanks in advance.

	- Aleksi


  # Helper routines for time printing
  def time(earlier=nil)
    t = Time.now
    $stderr.printf("%02d:%02d:%02d.%03d",
                   t.hour, t.min, t.sec, t.usec / 1000)
    if earlier
      printf(" %6.4f", t - earlier )
    end
    puts
    t
  end

  store = {}                   # where we stuff items
  key_fields = %w(kkkkkk rrrr) # which 'names' provide the unique id
  mod = "aaaa"                 # something random for the entry (and key)
  t=nil                        # for time difference

  24001.times do |i|
    # our real-world data; the content actually varies, but here we modify
    # each entry only to get a unique key for 'store'
    str = ("ddddddddddd=0.00&oooooooooooo=0&aaaaaaa=0.00&llll=0.00&"+
           "yyyyyy=0.00&pppp=0.00&rrrr=abcd1234#{mod}&vvvvvv=0&"+
           "eeee=5.90&bbb=0.00&sss=0.00&ttttt=NIL&mmmmmmm=0&"+
           "kkkkkk=ABCDE&nnnnnnnnnn=0.00&qqqqqq=12170305")
    mod.succ!

    # transfer input data into nice datastructure
    entry = {}
    str.split(/&/).each do |field|
      name, value = field.split(/=/)
      entry[name] = value
    end

    # find out the key
    key = []
    key_fields.each do |kf|
      key << entry[kf]
    end

    # keep each entry easily accessible
    store[key] = entry
  
    # print the time spent on every
    # 1000 items we process
    if i%1000 == 0
      printf(" %6d: ", i)
      t = time(t)
    end
  end

  puts "done!"

  sleep 120