Robert:
> > Ok, so we can't hope for anything better than secs/millisecs without going
> > platform-specific then. Point taken on relying on anything less than secs.
> > However I'm not relying on them for any critical func; I'm writing some
> > performance tests.
Dave:
> I'd guess your best bet would be to use the benchmark module and call
> your code in a loop a couple of million times. You'll need to write a

These comments may be too simple, but I wanted to share them anyway.

I'd say Dave's way is a good idea even with microsecond timing capability, as
most modern operating systems give processes unspecified and unguaranteed
slices of time for execution (even DOS :). So your friends are averaging and
loads of samples. Microsecond resolution would let a smaller sample base give
a reliable estimate, but probably not much smaller. Anyway, one can get quite
good estimates of code performance fast enough, at least when comparing the
running time to the time it took to write the test in the first place :).
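
To make the averaging idea concrete, here is a minimal sketch using the
standard benchmark library. The method `work` and the iteration counts are
made up for illustration; substitute the code you actually want to test.
Repeating the whole measurement a few times gives a rough feel for the
run-to-run spread.

```ruby
require 'benchmark'

# Hypothetical stand-in for the code under test.
def work
  (1..50).reduce(:+)
end

# Time `iterations` calls in one go and return the mean seconds per call.
# A single call is too short to time reliably; the loop averages out
# scheduler noise and the cost of the timing calls themselves.
def mean_time_per_call(iterations)
  total = Benchmark.realtime do
    iterations.times { work }
  end
  total / iterations
end

# Repeat the whole measurement to see how much the estimate moves around.
samples = Array.new(5) { mean_time_per_call(100_000) }
puts format("min %.3e s, max %.3e s per call", samples.min, samples.max)
```

The loop overhead of `times` is included in the figure, so for very cheap
code it may be worth timing an empty loop as well and subtracting it.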

If the code has anything to do with hardware, it's an order of magnitude harder
to get accurate estimates (some estimates one gets easily, yes, but accurate
ones :). And one shouldn't forget that memory (and the caches) is hardware too.
For example, there will be a dramatic change in execution speed when the inner
loop and all the data it uses happen to fit into cache.

Maybe this point is trivial, but when working with Ruby there's an additional
element of fuzziness in measurements: Ruby itself. In the example:

start = Time.now
my_ext_function(args)
elapsed_time = Time.now - start

Each call to Time.now (the first as well as the latter) can, in theory, take a
different amount of time when it's executed. The reason is quite simple:

1) Time.now is some C-code calling:
    obj = Data_Make_Struct(klass, struct time_object, 0, free, tobj);
2) Data_Make_Struct is a macro, turning into
   #define Data_Make_Struct(klass,type,mark,free,sval) (\
        sval = ALLOC(type),\
        memset(sval, 0, sizeof(type)),\
        Data_Wrap_Struct(klass,mark,free,sval)\
   )
3) ALLOC is a macro, turning into
   #define ALLOC(type) (type*)xmalloc(sizeof(type))
4) xmalloc is a macro calling ruby_xmalloc which has code like
    if (malloc_memories > GC_MALLOC_LIMIT) {
	rb_gc();
    }

The point is that in general, when we can't predict the memory layout, we
can't say how long garbage collection (rb_gc) will take, or whether it will
happen at all.
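
One way to see (and reduce) this effect from the Ruby side is sketched below.
It assumes a reasonably recent Ruby, where GC.count and Process.clock_gettime
are available, and it swaps Time.now for the monotonic clock. The method
`my_ext_function` here is only a hypothetical stand-in that allocates a lot,
and `timed_with_gc_count` is a helper invented for this example.

```ruby
# Hypothetical stand-in for my_ext_function(args); it allocates many objects.
def my_ext_function(n)
  Array.new(n) { |i| i.to_s }.size
end

# Run the block and return [elapsed seconds, number of GC runs that
# happened while the block was executing].
def timed_with_gc_count
  gc_before = GC.count
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
  [elapsed, GC.count - gc_before]
end

# Variant 1: let the collector run whenever it likes.
elapsed_free, gcs_free = timed_with_gc_count { my_ext_function(200_000) }

# Variant 2: collect first, then keep the collector off while measuring.
GC.start
GC.disable
begin
  elapsed_quiet, gcs_quiet = timed_with_gc_count { my_ext_function(200_000) }
ensure
  GC.enable # always switch the collector back on
end

puts format("GC allowed:  %.4f s, %d GC run(s)", elapsed_free, gcs_free)
puts format("GC disabled: %.4f s, %d GC run(s)", elapsed_quiet, gcs_quiet)
```

Disabling the collector only hides its cost from the measurement; the real
program still pays for it, so it may be fairer to report both figures.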

	- Aleksi