On Mon, Nov 12, 2012 at 10:12 PM, Charles Hixson
<charleshixsn / earthlink.net> wrote:
> Brian Candler wrote:

>> As others have said: using a Hash may be more efficient, if there are
>> large gaps - but then you may have to sort the keys if you want to
>> iterate over it in order. It depends on your use case.
>>
> Hashes are slow compared to direct indexing.

Did you measure it?  If not, please do.  It's easy to arrive at wrong
conclusions by assuming that what holds in other contexts is true in
a new context as well.
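
A minimal sketch of such a measurement, using the Benchmark module
from the standard library.  The element count and the random access
pattern are assumptions of mine, not your real workload, so please
adjust both before drawing conclusions:

```ruby
require 'benchmark'

n = 200_000 # hypothetical element count, adjust to your data

array = Array.new(n) { |i| i }
hash  = {}
n.times { |i| hash[i] = i }

# Visit the indexes in a fixed pseudo-random order so that neither
# structure benefits from purely sequential access.
indexes = (0...n).to_a.shuffle(random: Random.new(42))

array_sum = 0
hash_sum  = 0

Benchmark.bm(6) do |bm|
  bm.report('Array') { indexes.each { |i| array_sum += array[i] } }
  bm.report('Hash')  { indexes.each { |i| hash_sum  += hash[i] } }
end
```

The absolute numbers depend on the Ruby version and the machine; what
matters is whether the difference is large enough to be relevant for
your application.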

> I'm expecting this array to eventually use over a GB of storage, so

Are you talking about memory or number of instances?  I am asking
because the two are not the same, and it may be more difficult to
assess memory consumption than you think.
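
To illustrate the difference, here is a rough sketch using
ObjectSpace.memsize_of from the objspace extension.  The struct and
its fields are made up for the example, and the figures are only what
MRI reports for the objects themselves, so treat the result as a lower
bound rather than as the real footprint of the process:

```ruby
require 'objspace'

# Hypothetical record type; your real instances will look different.
item_class = Struct.new(:id, :payload)

count = 10_000
items = Array.new(count) { |i| item_class.new(i, "payload #{i}") }

# memsize_of does not follow references: for the Array it covers the
# slots holding the references, not the objects they point to.
array_bytes = ObjectSpace.memsize_of(items)

# So the elements and the strings they refer to are added separately.
item_bytes = items.inject(0) do |sum, item|
  sum + ObjectSpace.memsize_of(item) + ObjectSpace.memsize_of(item.payload)
end

total_bytes = array_bytes + item_bytes

puts "instances:      #{items.size}"
puts "array itself:   #{array_bytes} bytes"
puts "items + data:   #{item_bytes} bytes"
puts "bytes per item: #{total_bytes / count}"
```

Multiplying the per-item figure by the number of instances you expect
gives a first estimate, which is still not the same as what the
operating system reports for the process.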

> efficiency considerations are somewhat significant.  gdbm solves some of
> these, by not dealing with the entire array at once, but only with the
> recently active instances.  But it's slower, which I was trying to avoid by
> using an Array.  However, I don't know just how many items the Array will
> need to hold, it could easily be in the millions, and will at least be in
> the hundreds of thousands.

So how do you know we are talking about "over a GB of storage"?

>  So pre-allocating isn't very desirable, but
> neither is adding instance by instance.  (Besides, they won't necessarily
> come in order.  It could well be that I'll get the 100,000th entry
> immediately after getting the 21st.  No problem for a hash, but if I'm going
> to slow down to hash speed, I should probably just go directly to gdbm.)

Again: measure before you judge.  It's really much better to base
decisions on facts than on speculation.
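
For the out-of-order case you describe, a small sketch of what the two
structures do when entry 100,000 arrives right after entry 21 (the
values are placeholders):

```ruby
array = []
hash  = {}

# Entries arrive out of order: index 21 first, then index 100_000.
[21, 100_000].each do |i|
  array[i] = "entry #{i}"
  hash[i]  = "entry #{i}"
end

# The Array grows to cover every index up to the highest one and
# fills the gaps with nil; the Hash stores only what was assigned.
puts "array slots:   #{array.size}"         # 100001
puts "array entries: #{array.compact.size}" # 2
puts "hash entries:  #{hash.size}"          # 2

# Iterating the Hash in index order requires sorting the keys first.
ordered = hash.keys.sort.map { |k| hash[k] }
puts ordered.inspect
```

Whether the unused slots or the extra sort cost more in your case is
exactly the kind of thing a measurement will tell you.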

Cheers

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/