Robert Klemme wrote:

> With your set of NUMBERS I get
> 
> $ ruby ranges-2.rb
>                          user     system      total        real
> bitset              103.984000   0.000000 103.984000 (105.677000)
> inject-range         77.735000   0.000000  77.735000 ( 78.927000)
> inject-array         29.312000   0.015000  29.327000 ( 29.803000)
> inject-array-2       28.625000   0.000000  28.625000 ( 29.301000)
> inject-array-map     29.047000   0.000000  29.047000 ( 29.543000)
> to_ranges             0.672000   0.000000   0.672000 (  0.684000)
> to_ranges_2           0.594000   0.000000   0.594000 (  0.596000)
> to_ranges-map         0.750000   0.000000   0.750000 (  0.772000)
> to_ranges_2-map       0.609000   0.000000   0.609000 (  0.640000)
> 
> Your approach has a clear advantage if ranges are fairly large compared 
> to the overall number of elements in the array.  I wonder whether there 
> is still room for optimization with regard to division of the array.

Maybe there is, but your seek method 'inspired' me to do something else.
I wrote another linear approach (seek2) which is a faster (at least for
the given NUMBERS) and than reimplemented it in c.

Here are the results:

                           user     system      total        real
bitset               78.203000   0.156000  78.359000 ( 78.937000)
inject-range         41.766000   0.375000  42.141000 ( 42.172000)
inject-array         17.046000   0.000000  17.046000 ( 17.062000)
inject-array-2       17.469000   0.047000  17.516000 ( 17.532000)
inject-array-map     16.938000   0.015000  16.953000 ( 17.093000)
to_ranges             0.437000   0.000000   0.437000 (  0.438000)
to_ranges_2           0.391000   0.000000   0.391000 (  0.391000)
to_ranges-map         0.484000   0.000000   0.484000 (  0.484000)
to_ranges_2-map       0.453000   0.016000   0.469000 (  0.469000)
seek                 33.125000   0.016000  33.141000 ( 33.187000)
seek2                 8.860000   0.000000   8.860000 (  8.860000)
seek_c                0.078000   0.000000   0.078000 (  0.078000)

And thats realy bad. It's five times faster than our best
algorithm in ruby - and its a realy braindead implementation:

ruby:
---------------------------------------------------------------
p1, p2, d1, last = 0, 0, NUMBERS[0], NUMBERS.length - 1
ranges = []
while (p2+=1) <= last
   if (d2=NUMBERS[p2]) - d1 != p2 - p1
     ranges << (d1..NUMBERS[p2-1])
     p1, d1 = p2, d2
   end
end
ranges << (d1..NUMBERS[last])
---------------------------------------------------------------

and the same thing in c:

---------------------------------------------------------------
VALUE seek_c(VALUE self, VALUE arr)
{
   RArray* a = RARRAY(arr);
	
   int p1 = 0;
   int p2 = 0;
   int d1 = FIX2INT(a->ptr[0]);
   int d2 = 0;
   int last = a->len -1;

   VALUE ranges = rb_ary_new();

   while ((p2+=1) <= last)
   {
     if ((d2=FIX2INT(a->ptr[p2])) - d1 != p2 - p1)
     {
       rb_ary_push(ranges, rb_range_new(INT2FIX(d1), a->ptr[p2-1], 1));
       p1 = p2;
       d1 = d2;
     }
   }
   rb_ary_push(ranges, rb_range_new(INT2FIX(d1), a->ptr[last], 1));
   return ranges;
}
---------------------------------------------------------------

this is more than a hundred times faster, ruby needs a jit engine.

Simon