On Wed, Jan 1, 2014 at 10:46 PM, Jabari Z. <lists / ruby-forum.com> wrote:
> Ryan Davis wrote in post #1131938:
>> On Dec 31, 2013, at 8:40, Jabari Z. <lists / ruby-forum.com> wrote:
>>
>>> I have an algorithm which uses an array of boolean (True/False) data.
>>> I would like to conserve and extend memory use by having the array
>>> use less memory.
>>
>> Why? Do you have a measurable problem?
>
> Yes.
>
> Ideally I want to store the boolean (True/False) data in arrays that
> take the least amount of memory to enable me to run out of the cpu cache
> as much as possible and to allow the algorithm to search over larger
> numbers.

What algorithm is that?

> Currently I'm memory limited on the size of numbers I can process.
>
> On a 32/64 bit cpu a boolean would take 4/8 bytes of memory.
> Even if I could do just byte addressing I could compute up to 4/8 times
> larger numbers, and with bit arrays 32/64 times larger number within the
> same address space, which I could fit in cache memory.
>
> This is done all the time with languages like C/++, et al, which allow
> you to control hardware at the bit level of registers and memory.

Well, if your performance needs are _that_ pressing then Ruby may not
be the proper tool.

> If you chunk your sizes correctly, the performance/memory advantages can
> exceed the administrative overhead versus standard arrays,
>
> It would be nice if Ruby would include this capability natively.

There are indeed some options (see below - albeit probably not exactly
what you expect) but I am not sure whether the performance gains you
are hoping for will be realized. The reason is that if the rest of
your algorithm is implemented in Ruby and it allocates objects then
there will be overhead of Ruby's GC plus you have no control over the
placement in memory. The effect of well aligned byte arrays in memory
may just not measurable.

> I did try the gems that were suggested (Peter Cooper and Ingram's) and
> they do work for me. Now I want to see if I can make them faster or find
> something that is.

There are two quite obvious alternatives:
 - use Integers
 - use Strings

I once cooked something together using Strings with an optimization
using chunks:
https://gist.github.com/rklemme/2775600

Note sure what state that is in.

If you rarely change bits and query frequently then a simple Integer
works pretty well because operator [] is used for bit indexing:

irb(main):003:0> flag = '0011011010'.to_i(2)
=> 218
irb(main):004:0> flag.to_s 16
=> "da"
irb(main):005:0> 8.times {|idx| printf "%3d %1d\n", idx, flag[idx]}
  0 0
  1 1
  2 0
  3 1
  4 1
  5 0
  6 1
  7 1
=> 8

If there are frequent updates then solutions using String are more
efficient.  If you consider non Ruby solutions then of course a C
solution is even more efficient - if that difference is measurable.  I
would always start with a Ruby implementation and only switch to a C
implementation if Ruby proves too slow.

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/