Brian,


On 2011-03-19 00:42, Brian Candler wrote:
>> The other thought I had was putting the data into a sqlite3 db - I will
>> try it and see what happens but I don't imagine it would be faster than
>> a memory based hash table?
>
> Probably not, but at least it persists, and it may be cheaper to make
> updates to a large external data structure than rebuild the entire
> structure in memory each time.


Yes, using a db was much slower.


> In your application, do you really need to use Strings?


I guess not, but I preferred a tag of:

	01.01.01.01

to:

	1010101
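
That said, going between the two forms looks cheap, so an integer key could still be printed as a dotted tag. A minimal sketch, assuming every tag has four parts of exactly two digits (tag_to_int and int_to_tag are names I made up):

```ruby
# Hypothetical helpers; assume four two-digit parts such as "01.01.01.01".
def tag_to_int(tag)
  # Treat each part as two decimal digits of one larger number.
  tag.split(".").inject(0) { |acc, part| acc * 100 + part.to_i }
end

def int_to_tag(n, parts = 4)
  # Restore leading zeros, then split back into two-digit groups.
  n.to_s.rjust(parts * 2, "0").scan(/../).join(".")
end

puts tag_to_int("01.01.01.01")  # 1010101
puts int_to_tag(1010101)        # 01.01.01.01
```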


> Your sample code looks like it's handling numeric-style data (although I
> realise this is just a test case for the problems you're having).
> Integers in the range -2^30..+2^30 (or larger on a 64-bit machine)
> have their values encoded within the reference, so no memory allocation
> is done.


Are you talking about the hash keys or the hash values? The values in
the real script will all be floats.
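
For what it's worth, a quick check along these lines seems to show the difference for keys. This is a sketch only: same_object? is a name I made up, the tag values are invented, and floats may well behave differently.

```ruby
# Hypothetical helper: true only when both references point at one object.
def same_object?(a, b)
  a.equal?(b)
end

# Equal small integers are the very same object: the value lives in the reference.
puts same_object?(1010101, 1010101)  # true

# Equal strings built separately are distinct objects on the heap.
str_a = String.new("01.01.01.01")
str_b = String.new("01.01.01.01")
puts str_a == str_b                  # true
puts same_object?(str_a, str_b)      # false
```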


> Or, if you're handling a relatively small set of unique values, you
> could use symbols instead of strings. Each symbol reference again
> doesn't allocate any memory; it just points to the entry in the symbol
> table.


Not sure what you mean - example?
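
My guess at what this would look like, as a minimal sketch (the labels are made up):

```ruby
# Made-up labels standing in for a small set of values that repeat a lot.
LABELS = %w[low mid high low low mid]

AS_STRINGS = LABELS.map { |l| String.new(l) }  # six separate String objects
AS_SYMBOLS = LABELS.map { |l| l.to_sym }       # references into the symbol table

puts AS_STRINGS.map { |s| s.object_id }.uniq.size  # 6
puts AS_SYMBOLS.map { |s| s.object_id }.uniq.size  # 3
```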


> Or you could use frozen strings and share the references.
>
> LABEL1 = "00".freeze
> LABEL2 = "01".freeze
> MAP = { LABEL1 => LABEL1, LABEL2 => LABEL2 }
> a = MAP["00"]          # lookup by value returns the shared frozen object
> puts a.object_id
> puts LABEL1.object_id  # same id as the line above


I ran that code, but I don't understand how it helps.


> Although that's more work than symbols, it might be useful depending on
> your use case. For example, you could replace a subset of the values you
> see with these frozen strings (which covers the majority of the data),
> whilst still allowing arbitrary other strings.


Still not clear - examples?
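
If I follow, the idea is to swap each incoming string for the shared frozen copy when one exists, and keep it as-is otherwise. A minimal sketch of my reading (canonical and the sample data are made up):

```ruby
LABEL1 = "00".freeze
LABEL2 = "01".freeze
MAP = { LABEL1 => LABEL1, LABEL2 => LABEL2 }

# Hypothetical helper: return the shared frozen copy if the value is a known
# label, otherwise hand back the string unchanged.
def canonical(str)
  MAP.fetch(str, str)
end

# Freshly allocated strings, as if read from a file (made-up data).
incoming = %w[00 01 00 zz 00].map { |s| String.new(s) }
stored   = incoming.map { |s| canonical(s) }

puts incoming.map { |s| s.object_id }.uniq.size  # 5
puts stored.map { |s| s.object_id }.uniq.size    # 3
```

The stored array then holds a single "00" object however many times it appears, which I take to be the memory saving.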

The other thing that occurred to me was that on my 64-bit machine maybe 
I could run 2-3 threads for inserting into the hash table?

BTW, I did end up changing to JSON - a YAML dump of a 32000x22 hash
table was deadly.
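
For reference, the JSON round trip is roughly this. It is a sketch with a made-up two-row table rather than the real data, and dump_table and load_table are names I made up. Note that JSON.parse hands back string keys.

```ruby
require "json"

# Hypothetical helpers wrapping the standard json library.
def dump_table(table)
  JSON.generate(table)
end

def load_table(text)
  JSON.parse(text)
end

# Made-up sample: dotted tags as keys, floats as values.
table = { "01.01.01.01" => [1.5, 2.25], "01.01.01.02" => [3.0, 4.75] }
text  = dump_table(table)
puts load_table(text) == table  # true
```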

Thanks,

Phil.
-- 
Philip Rhoades

GPO Box 3411
Sydney NSW	2001
Australia
E-mail:  phil / pricom.com.au