Thank you for your response, Tanaka-san.

On 2/12/06, Masahiro TANAKA <masa / ir.isas.ac.jp> wrote:
> This is probably because a function (SetFuncs) is called
> every step of the loop. How about using na_change_type() ?

Your suggestion is very fast when the types already match, but if the
original is not a double, the penalty for copying the array is
significant. It also gives incorrect results if the type cannot be
represented as a float (e.g. complex). (See the benchmarks below.)

I appreciate the generality of NArray's design, and I've learned several
nice tricks from investigating it. However, the moral I'm taking home is
that if speed, efficiency and generality are all at issue, individual
routines for each NArray type are still the best way to go at the C
level. To build extensions from a single algorithm description, the best
way seems to be some macro-style pseudo-C code that can be parsed to
generate a separate C function for each type. Do you agree? This seems
to justify the approach of the PDL project (Perl) and the PP language it
employs for general extensions. (And I was so hoping to keep this at the
straightforward C level to exploit the beauty of Ruby's C API. Ah well,
it is just C.)

Any additional comments or suggestions are welcome; I'd love to find
something I missed.

Thanks!
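To make the macro idea above a little more concrete, here is a minimal sketch in plain C preprocessor terms rather than a real PP-style generator. The macro name DEFINE_WMEAN and the generated wmean_<suffix> functions are hypothetical names chosen for illustration and are not part of NArray; the sketch works on bare C arrays instead of struct NARRAY, and it leaves out complex, which would need its own loop body.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro (not NArray API): expands one algorithm description
 * into a separate weighted-mean function for a given element type. */
#define DEFINE_WMEAN(suffix, type)                                  \
    double wmean_##suffix(const type *v, const type *w, size_t n)   \
    {                                                               \
        double p_sum = 0.0, w_sum = 0.0;                            \
        size_t i;                                                   \
        assert(v != NULL && w != NULL);                             \
        for (i = 0; i < n; i++) {                                   \
            p_sum += (double)w[i] * (double)v[i];                   \
            w_sum += (double)w[i];                                  \
        }                                                           \
        return p_sum / w_sum;                                       \
    }

/* One instantiation per element type; no per-element function-pointer
 * call and no temporary copy of the input arrays. */
DEFINE_WMEAN(int, int)
DEFINE_WMEAN(sfloat, float)
DEFINE_WMEAN(dfloat, double)
```

A Ruby-visible wrapper would then pick wmean_int, wmean_sfloat or wmean_dfloat once, based on the array's type, before entering the loop, so the dispatch cost is paid per call rather than per element.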
Cameron

----
Benchmarking based on:
  num_runs = 1000
  data_size = 100000
============================================================
==> INT <==
creating vectors: NArray.int(data_size).random!(100)
                 user     system      total        real
NArray_rb    1.050000   0.410000   1.460000 (  1.459850)
NArray_mod   2.670000   0.030000   2.700000 (  2.706433)
NArray_ct    2.220000   1.600000   3.820000 (  3.819964)
============================================================
==> SFLOAT <==
creating vectors: NArray.sfloat(data_size).random!(100)
                 user     system      total        real
NArray_rb    2.590000   0.340000   2.930000 (  2.928844)
NArray_mod   2.440000   0.020000   2.460000 (  2.467045)
NArray_ct    2.100000   1.610000   3.710000 (  3.715267)
============================================================
==> FLOAT <==
creating vectors: NArray.float(data_size).random!(100)
                 user     system      total        real
NArray_rb    2.800000   0.800000   3.600000 (  3.605653)
NArray_mod   3.920000   0.010000   3.930000 (  3.934259)
NArray_ct    1.020000   0.000000   1.020000 (  1.026064)
============================================================
==> COMPLEX <==
creating vectors: NArray.complex(data_size).random!(100)
                 user     system      total        real
NArray_rb    5.970000   1.420000   7.390000 (  7.390152)
NArray_mod   3.950000   0.020000   3.970000 (  3.971336)
NArray_ct    2.820000   1.590000   4.410000 (  4.411596)

Attachment: na_wmean.rb

#!/usr/bin/env ruby

require 'rubygems'
require 'narray'
require 'inline'

class NArray
  inline do |builder|
    builder.add_compile_flags %q(-I /export/home/cameron/sys/narray-0.5.8/)
    builder.include '"narray.h"'
    builder.include '"narray_local.h"'   # few local things used in linspace

    # Fast path: assumes both arrays already hold doubles.
    builder.c_raw <<-'END_CODE'
      VALUE wmean_dbl(int argc, VALUE *argv, VALUE self) {
        int i;
        struct NARRAY *nv, *nw;
        double p_sum = 0.0, w_sum = 0.0;

        GetNArray(self, nv);
        GetNArray(argv[0], nw);

        if(nv->total != nw->total)
          rb_raise( rb_eArgError, "Vector and weight must be same size!" );

        for(i=0 ; i < nv->total ; i++) {
          p_sum += ((double *)nw->ptr)[i] * ((double *)nv->ptr)[i];
          w_sum += ((double *)nw->ptr)[i];
        }

        return rb_float_new( p_sum / w_sum);
      }
    END_CODE

    # Converts both arrays to NA_DFLOAT first (copies when types differ).
    builder.c_raw <<-'END_CODE'
      VALUE wmean_ct(int argc, VALUE *argv, VALUE self) {
        int i;
        struct NARRAY *nv, *nw;
        double p_sum = 0.0, w_sum = 0.0;
        VALUE vself, varg0;

        vself = na_change_type(self, NA_DFLOAT);
        varg0 = na_change_type(argv[0], NA_DFLOAT);

        GetNArray(vself, nv);
        GetNArray(varg0, nw);

        if(nv->total != nw->total)
          rb_raise( rb_eArgError, "Vector and weight must be same size!" );

        for(i=0 ; i < nv->total ; i++) {
          p_sum += ((double *)nw->ptr)[i] * ((double *)nv->ptr)[i];
          w_sum += ((double *)nw->ptr)[i];
        }

        return rb_float_new( p_sum / w_sum);
      }
    END_CODE

    # Generic version: converts each element through SetFuncs inside the loop.
    builder.c_raw <<-'END_CODE'
      VALUE wmean(int argc, VALUE *argv, VALUE self) {
        int i,sv,sw;
        struct NARRAY *nv, *nw;
        double wt,val;
        double p_sum = 0.0, w_sum = 0.0;
        char *v,*w;
        void (*na_getv)();
        void (*na_getw)();

        GetNArray(self, nv);
        GetNArray(argv[0], nw);

        if(nv->total != nw->total)
          rb_raise( rb_eArgError, "Vector and weight must be same size!" );

        na_getv = SetFuncs[NA_DFLOAT][nv->type];
        na_getw = SetFuncs[NA_DFLOAT][nw->type];

        sv = na_sizeof[nv->type];
        sw = na_sizeof[nw->type];

        v = nv->ptr;
        w = nw->ptr;

        for(i=0 ; i < nv->total ; i++) {
          (*na_getv)( 1, &val, 0, v, 0 );
          (*na_getw)( 1, &wt, 0, w, 0 );
          v += sv;
          w += sw;
          p_sum += (wt) * (val);
          w_sum += (wt);
        }

        return rb_float_new( p_sum / w_sum);
      }
    END_CODE
  end
end

Attachment: bench_wmean.rb

#!/usr/bin/env ruby

require 'benchmark'
include Benchmark
# require 'gsl'
require 'na_wmean.rb'

n = 1_000
data_size = 100_000

def wmean_orig(xt,wt)
  (xt * wt).sum / wt.sum
end

puts "Benchmarking based on: "
puts " num_runs = #{n}"
puts " data_size = #{data_size}"

x = nil
for type in ["int", "sfloat", "float", "complex"]
  puts "