Well, I wrote a binary search in c, added to the array class, and rebuilt
ruby. I am surprised this was so easy. As it turns out, the c version was
only 2 seconds faster. The pure Ruby version took 40 seconds to insert 10K
rows into my btree (also pure ruby code). The c version of the binary
search took 38 seconds to insert the same 10K records. Impressive, I
think.
However, I was hoping to get a little better performance. I notice that
most of the time is spent in the bsearch code and I was hoping to improve
it's performance. I expected I/O to be the bottleneck. My CPU is running
at ful speed as the rows are being inserted. I am running Red Hat 6.2 on a
dual Pentium 233 processor with SCSI Ultra Wide HDs.
In case anyone is interested, here is the binary search code I used.
Please keep in mind that I need this function to return the index of the
first element that is equal to or higher than "value". That is why I
called it bsearch2. The reason for the extra loop (l.upto(h)) at the
bottom is because it performs better for a small number of elements in the
array. In other words, the algorithm below performs faster than a pure
binary search. Actually numbers like "while(h-l>7)" perform even better
when "value" is an integer. I kept the number at 3 because I suspect the
performance would be worse for other types of objects because cost of the
compare will higher.
class Array
def bsearch2(value) # return the index of value or the first element >
value, assumes the array is sorted
h = self.length-1
l = 0
while (h - l > 3) do
m = (h+l)/2
if (value <= self.at(m))
h = m - 1
else
l = m + 1
end
end
l.upto(h) {|m| return m if (value <= self.at(m))} # find our place
in this page
return (h + 1) if (h < self.length-1) # this catches the one case
where we adjusted h from a value that could have worked
return -1 # no element > than value
end
end
Here is the C-code I added to array.c for both the search I needed and the
a true Binary search (in case someone would like to integrate them into the
language):
static VALUE
rb_ary_bsearch(ary, val)
VALUE ary;
VALUE val;
{
long h, l, m;
VALUE v;
h = RARRAY(ary)->len - 1;
l = 0;
while (l <= h) {
m = (l + h) / 2;
if (FIXNUM_P(val) && FIXNUM_P(RARRAY(ary)->ptr[m])) {
v = val - RARRAY(ary)->ptr[m];
}
else if (TYPE(val) == T_STRING && TYPE(RARRAY(ary)->ptr[m]) ==
T_STRING) {
v = rb_str_cmp(val, RARRAY(ary)->ptr[m]);
}
else {
v = rb_funcall(val, cmp, 1, RARRAY(ary)->ptr[m]);
}
//printf ("\n bsearch data - val=%d, ary[m]=%d, l=%ld, m=%ld, h=%ld,
v=%d\n", FIX2INT(val), FIX2INT(RARRAY(ary)->ptr[m]), l, m, h, FIX2INT(v));
if (FIX2INT(v) < 0) {
h = m - 1;
}
else if (FIX2INT(v) > 0) {
l = m + 1;
}
else {
return INT2NUM(m);
}
}
return Qnil;
}
static VALUE
rb_ary_bsearch2(ary, val)
VALUE ary;
VALUE val;
{
long h, l, m;
VALUE v;
h = RARRAY(ary)->len - 1;
l = 0;
while (h - l > 2) {
m = (l + h) / 2;
if (FIXNUM_P(val) && FIXNUM_P(RARRAY(ary)->ptr[m])) {
v = (val - RARRAY(ary)->ptr[m]);
}
else if (TYPE(val) == T_STRING && TYPE(RARRAY(ary)->ptr[m]) ==
T_STRING) {
v = rb_str_cmp(val, RARRAY(ary)->ptr[m]);
}
else {
v = rb_funcall(val, cmp, 1, RARRAY(ary)->ptr[m]);
}
//printf ("\n bsearch2 data - val=%d, ary[m]=%d, l=%ld, m=%ld,
h=%ld, v=%d\n", FIX2INT(val), FIX2INT(RARRAY(ary)->ptr[m]), l, m, h,
FIX2INT(v));
if (FIX2INT(v) <= 0) {
h = m - 1;
}
else {
l = m + 1;
}
}
for (; l <= h; l++) {
if (FIXNUM_P(val) && FIXNUM_P(RARRAY(ary)->ptr[l])) {
v = (val - RARRAY(ary)->ptr[l]);
}
else if (TYPE(val) == T_STRING && TYPE(RARRAY(ary)->ptr[l]) ==
T_STRING) {
v = rb_str_cmp(val, RARRAY(ary)->ptr[l]);
}
else {
v = rb_funcall(val, cmp, 1, RARRAY(ary)->ptr[l]);
}
//printf ("\n bsearch2 data2 - val=%d, ary[l]=%d, l=%ld, m=%ld,
h=%ld, v=%d\n", FIX2INT(val), FIX2INT(RARRAY(ary)->ptr[l]), l, m, h,
FIX2INT(v));
if (FIX2INT(v) <= 0) {
return INT2NUM(l);
}
}
if (h < RARRAY(ary)->len - 1) {
return INT2NUM(h + 1);
}
return INT2NUM(-1);
}
-----Original Message-----
From: Michael Davis [SMTP:mdavis / sevainc.com]
Sent: Monday, February 19, 2001 11:36 PM
To: 'ruby-talk / ruby-lang.org'
Subject: RE: [ruby-talk:11154] Re: Array.insert_at
Thanks for the quick response. Nice solution! I am surprised that I
overlooked this in the book. Is there a binary search function for sorted
arrays? My attempts to create one are proving to be slow.
-----Original Message-----
From: Yukihiro Matsumoto [SMTP:matz / zetabits.com]
Sent: Monday, February 19, 2001 10:29 PM
To: ruby-talk ML
Subject: [ruby-talk:11154] Re: Array.insert_at
Hi,
In message "[ruby-talk:11153] Array.insert_at"
on 01/02/20, Michael Davis <mdavis / sevainc.com> writes:
|Has anyone created a method called insert_at(index, value) for the Array
|class? I would like this to expand the array and insert value at position
|x.
a = [1,2,3,4,5]
a[2,0] = [12,13]
matz.