On Thu, 26 Oct 2006 poopdeville / gmail.com wrote:

> Hi everybody,
>
> I'm writing a fairly open-ended question.  I'm hoping for suggestions,
> opinions, advice.  Suppose I have n arrays, each of which has m entries.  m
> is a fairly large integer, on the order of 10,000.  Each entry is either 1
> or 0.
>
> The first task I need to accomplish is figuring out how many times a 1
> occurs in the ith entry in an array.  So for concreteness, if I had arrays:
>
> first = [1,0,0,0,0]
> second = [1,1,0,0,0]
> third = [0,0,0,1,0]
>
> I would end up with
> count = [2,1,0,1,0]

     harp:~ > cat a.rb
     require 'narray'

     first = NArray.to_na [1,0,0,0,0]
     second = NArray.to_na [1,1,0,0,0]
     third = NArray.to_na [0,0,0,1,0]

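     # eq(1) marks each entry equal to 1; adding the masks counts the 1s per index.
     # note: the sum is a byte array (see output below), so this form only
     # suits per-index counts up to 255.
     count = first.eq(1) + second.eq(1) + third.eq(1)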

     p count


     harp:~ > ruby a.rb
     NArray.byte(5):
     [ 2, 1, 0, 1, 0 ]


> I'm just trying to give the general flavor of what I'm working on.  I know I
> can use some simple each_with_index loops to increment count[index]
> (something along the lines of:)
>
> count = Array.new(m,0)
> [first, second, third].each do |array|
>  array.each_with_index do |item, index|
>    count[index] += item
>  end
> end
>
> There are going to be m * 3n * (two constant multipliers for the
> looping) object allocations and method calls.  m is fairly large, and I
> have other, similar, tasks to accomplish with this data.  The faster I
> can process this data, the more data I can process in a given amount of
> time, and the more accurate the analysis will be.

i use narray on huge (> 1gb) datasets all the time.  it's blindingly fast.


regards.

-a
-- 
my religion is very simple.  my religion is kindness. -- the dalai lama