> The other thought I had was putting the data into a sqlite3 db - I will
> try it and see what happens but I don't imagine it would be faster than
> a memory based hash table?

Probably not, but at least it persists, and it may be cheaper to make 
updates to a large external data structure than to rebuild the entire 
structure in memory each time.

In your application, do you really need to use Strings?

Your sample code looks like it's handling numeric-style data (although I 
realise this is just a test case for the problems you're having). 
In MRI, Integers in the range -2^30..2^30-1 (or -2^62..2^62-1 on a 
64-bit machine) have their values encoded within the reference itself, 
so no memory allocation is done.
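
A quick way to see the difference, as a minimal sketch (same_object? is 
just a helper name I made up for illustration): two references to the 
same small integer are the very same object, whereas two strings with 
the same contents are separate allocations.

```ruby
# Hypothetical helper: true when both arguments are the very same object.
def same_object?(x, y)
  x.equal?(y)
end

# Small integers are immediate values, so nothing is allocated for them.
puts same_object?(12345, 12345)

# String.new allocates a fresh object on every call, even for equal contents.
puts same_object?(String.new("12345"), String.new("12345"))
```

The first line should print true and the second false.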

Or, if you're handling a relatively small set of unique values, you 
could use symbols instead of strings. Each symbol reference again 
doesn't allocate any memory; it just points to the entry in the symbol 
table.
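
As a rough sketch of the symbol approach (the sample values and the 
distinct_objects helper are invented for illustration): converting the 
incoming strings with to_sym collapses every repeated value onto a 
single symbol.

```ruby
# Hypothetical helper: counts how many distinct objects an array refers to.
def distinct_objects(values)
  values.map(&:object_id).uniq.size
end

# Made-up sample data: six freshly allocated strings, only two distinct values.
raw = Array.new(6) { |i| format("%02d", i % 2) }
syms = raw.map(&:to_sym)

puts distinct_objects(raw)   # one object per string
puts distinct_objects(syms)  # one object per distinct value
```

With this sample data the string array refers to six objects and the 
symbol array to only two.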

Or you could use frozen strings and share the references.

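LABEL1 = "00".freeze
LABEL2 = "01".freeze
# Map each value to its single shared, frozen instance
MAP = {LABEL1 => LABEL1, LABEL2 => LABEL2}
a = MAP["00"]            # lookup is by content, so any "00" finds LABEL1
puts a.object_id
puts LABEL1.object_id    # same id as above: a and LABEL1 are one object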

Although that's more work than symbols, it might be useful depending on 
your use case. For example, you could replace just the most common 
values you see with these frozen strings, which would cover the majority 
of the data, whilst still allowing arbitrary other strings.
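
Something along these lines might do for the mixed case. It's only a 
sketch: SHARED_LABELS and shared_label are names I've made up, and the 
list of common values is a placeholder for whatever dominates your data. 
Hash#fetch with a default returns the shared frozen copy when there is 
one, and hands back the original string otherwise.

```ruby
# Assumed set of common values; substitute whatever dominates your data.
SHARED_LABELS = %w[00 01].each_with_object({}) do |s, h|
  label = s.dup.freeze
  h[label] = label
end.freeze

# Hypothetical helper: return the shared frozen copy if one exists,
# otherwise pass the original string through untouched.
def shared_label(str)
  SHARED_LABELS.fetch(str, str)
end

a = shared_label(String.new("00"))
b = shared_label(String.new("00"))
c = shared_label(String.new("zz"))

puts a.equal?(b)  # both lookups return the one shared object
puts c            # an uncommon value comes back as it went in
```

Only the uncommon strings then cost an allocation each; everything in 
the table is stored once.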

-- 
Posted via http://www.ruby-forum.com/.