On Wed, May 23, 2012 at 12:22 PM, George Dupre <lists / ruby-forum.com> wrote:
> I have to:
> 1) generate a database of a couple dozens of millions of Fixnums
> (ranging from 0 to 2^32 - 1), while avoiding redundancy
> 2) iterate through them
> 3) quickly search for the presence of a given Fixnum in the database
>
> The Set class fulfills the speed conditions and conveniently handles
> redundancy itself, but its uses up too much memory. It looks like each
> entry uses up around a 100 bytes, even if I only put 4 bytes in there.
> Array#include? is too slow without solving the memory problem.
> Representing each Fixnum as 4 bytes in a huge String doesn't use up much
> memory at all and String#include? is fast enough, but I can't tell it to
> only search by 4 bytes increments.

You could use a BigNum as bitset.  But that might be slow.  I once
created a BitSet class which uses a String internally for storage.
IIRC that was quite efficient and fast.  I put up an illustrating
example here
https://gist.github.com/2775600

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/