|From: Cameron McBride
> I'm trying to make the fast extensions to NArray, yet still preserve
> the general nature of methods (i.e. works on all NArray types).  As an
> example, I'll show benchmarks of a simple weighted mean:
> 
> Benchmark parameters:
>   num_runs  = 1000
>   data_size = 100000
>                  user     system      total        real
> NArray_rb      3.150000   0.660000   3.810000 (  3.803652)
> NArray_mod     3.930000   0.020000   3.950000 (  3.949027)
> NArray_dbl     1.040000   0.000000   1.040000 (  1.039897)
> GSL            1.670000   0.000000   1.670000 (  1.665538)

> I did the quick NArray_dbl hack, which is fast but explicitly casts to
> 'double', so it's not general.  When I generalized it using some
> internal SetFuncs of NArray, the result is slower than a ruby version
> that does a double loop (two #sum calls).
> 
> What am I missing?  As a success meter, I'd like it to be at least as
> fast as GSL libs.

This is probably because a function (one of the SetFuncs) is called
at every step of the loop.  How about using na_change_type() to
convert both arrays to double once, before the loop?

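    /* Weighted mean: self holds the values, argv[0] the weights. */
    VALUE
    wmean_dbl(int argc, VALUE *argv, VALUE self) {

      int i;
      struct NARRAY *nv, *nw;
      double *v, *w;
      double p_sum = 0.0, w_sum = 0.0;
      VALUE vself, varg0;

      /* argv[0] is read below, so make sure it exists. */
      if (argc != 1)
        rb_raise(rb_eArgError, "wrong number of arguments (%d for 1)", argc);

      /* Convert both arrays to double once, so the loop needs no
         per-element function call. */
      vself = na_change_type(self, NA_DFLOAT);
      varg0 = na_change_type(argv[0], NA_DFLOAT);

      GetNArray(vself, nv);
      GetNArray(varg0, nw);

      if (nv->total != nw->total)
        rb_raise(rb_eArgError, "Vector and weight must be same size!");

      v = (double *)nv->ptr;
      w = (double *)nw->ptr;

      for (i = 0; i < nv->total; i++) {
        p_sum += w[i] * v[i];
        w_sum += w[i];
      }

      return rb_float_new(p_sum / w_sum);
    }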
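
To show the idea apart from NArray, here is a rough stand-alone sketch
of the same pattern in plain C: pick the element type once, convert to
double, then run a tight loop.  The names elem_type, to_double and
wmean are made up for this illustration and are not part of NArray;
only two element types are handled, and the caller is assumed to free
the converted buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical element types, standing in for NArray's type codes. */
enum elem_type { ELEM_INT32, ELEM_FLOAT64 };

/* Convert n elements of the given type into a newly allocated double
 * buffer.  Returns NULL if allocation fails; the caller frees it. */
static double *to_double(const void *src, enum elem_type type, size_t n)
{
    double *dst = malloc(n * sizeof *dst);
    size_t i;

    if (dst == NULL)
        return NULL;

    switch (type) {   /* the type is examined once, outside the loops */
    case ELEM_INT32:
        for (i = 0; i < n; i++)
            dst[i] = (double)((const int32_t *)src)[i];
        break;
    case ELEM_FLOAT64:
        for (i = 0; i < n; i++)
            dst[i] = ((const double *)src)[i];
        break;
    }
    return dst;
}

/* Weighted mean of two double arrays: sum(w[i]*v[i]) / sum(w[i]). */
static double wmean(const double *v, const double *w, size_t n)
{
    double p_sum = 0.0, w_sum = 0.0;
    size_t i;

    assert(v != NULL && w != NULL);
    for (i = 0; i < n; i++) {
        p_sum += w[i] * v[i];
        w_sum += w[i];
    }
    return p_sum / w_sum;
}
```

The conversion costs one extra pass and a temporary buffer, but the
inner loop then has no function call per element, which is where the
generalized version probably loses its time.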

Masahiro Tanaka