On Mon, 2005-05-16 at 20:00 +0900, Stephen Kellett wrote:
> In message <1116192218.8384.11.camel / localhost.localdomain>, Zed A. Shaw
> <zedshaw / zedshaw.com> writes
> >For the people who can't or won't read, the test is informal and shows
> >that Ruby/Odeum is about 10 times faster when doing a search.
> 
> You should be able to compare the Ruby/JVM startup times by writing
> minimal apps for each that are effectively
> 
>         void main()
>         {
>         }
> 
> Run each 1000 times and compare.
> 
Actually, I have a confession to make in that I anticipated this and set
a trap. :-)

The first thing is that there's no statistical basis for "1000 times".
You actually want to run the test several times in a series of sample
runs and then determine the common ramp-up time from a cold start.
Otherwise you'll never know if the few times you ran your "1000 times"
test were just flukes or not.
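
As a rough sketch of what "a series of sample runs" could look like (not
what I actually ran; `launch_times` is a made-up helper, it assumes a
current Ruby with Process.clock_gettime, and nothing here forces a truly
cold start since the OS cache stays warm):

```ruby
require "rbconfig"

# Hypothetical helper: launch `cmd` `runs` times in a row and return the
# wall-clock seconds each launch took, in order.
def launch_times(cmd, runs)
  Array.new(runs) do
    start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    system(*cmd, out: File::NULL, err: File::NULL)
    Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
  end
end

# Several short series instead of one big batch, so a fluke series stands
# out. The command is an empty Ruby program; swapping in something like
# ["java", "Empty"] (an assumed, precompiled empty class) would give the
# JVM side of the comparison.
series = Array.new(3) { launch_times([RbConfig.ruby, "-e", ""], 5) }

series.each_with_index do |times, i|
  mean = times.sum / times.size
  printf("series %d: first %.4fs, mean %.4fs\n", i + 1, times.first, mean)
end
```

If the first launch of each series is consistently slower than the rest,
that's the ramp-up showing; if whole series disagree with each other,
one batch of "1000 times" wouldn't have told you much.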

The second thing is that your simple main() for both systems actually
isn't the "start-up time" since there is complexity in the class loader,
hotspot JIT compilers, Ruby source translation, etc.  All you are
testing is the time it takes to load your one little main function.

The actual way to test without the JVM and Ruby start-up times is to do
the timing inside the process rather than outside.  In other words, have
a test case that just runs 1000 times and measure either the total time
for the whole run, or the mean and standard deviation of the individual
measurements.  Again, when you do the test this way you have to figure
out the common ramp-up time for the system so that you can remove those
measurements from the test case as outliers later.
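
Something along these lines, as a minimal sketch only: `time_samples`
and the `warmup` count of 50 are invented for illustration, the string
search is a stand-in for the real work, and picking the warm-up cutoff
properly is exactly the hard part.

```ruby
# Hypothetical helper: time the given block `runs` times inside one
# process, drop the first `warmup` measurements as ramp-up, and return
# the mean and sample standard deviation of the rest.
def time_samples(runs, warmup)
  raise ArgumentError, "need at least 2 runs after warm-up" if runs - warmup < 2

  samples = Array.new(runs) do
    start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield
    Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
  end

  steady = samples.drop(warmup)
  mean = steady.sum / steady.size
  variance = steady.sum { |t| (t - mean)**2 } / (steady.size - 1)
  { mean: mean, stddev: Math.sqrt(variance), samples: steady }
end

# Stand-in workload; a real test would call the search being measured.
stats = time_samples(1000, 50) { "needle in a haystack".index("hay") }
printf("mean %.9fs, stddev %.9fs over %d runs\n",
       stats[:mean], stats[:stddev], stats[:samples].size)
```

The interpreter start-up never enters the measurement because the clock
only runs around the block itself.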

But of course, all of this would take way too long. I'll let Lucene
folks go through that pain if they feel the need. :-)

> I think 5 times is far too few when you are relying on the OS to load
> stuff etc. You should discard the first time, as all subsequent times
> will most likely bring your DLLs/SOs from cache.
> 
You're right in a way, but your idea that it is only the "first time"
isn't quite right as the ramp-up period can vary between runs.

FYI, I did the mean of 5 samples after running a few to get rid of
ramp-up.  I just "eye-balled" the ramp-up, so don't quote me on the
validity at all.

Also, there's solid statistics behind only doing a few samples, but I
didn't use any of those techniques.  I believe entire industries have
been founded on papers with only 3 samples. :-)

> For what it's worth, on my 1GHz Athlon Windows XP box when I run Java
> Performance Validator and Ruby Performance Validator, I get the
> impression that Ruby startup time is longer than JVM startup time. But
> then again there is all the time of the injected stub from JPV/RPV as
> well, and maybe the RPV stub is taking longer, not Ruby.
> 
Interesting.

> If Ruby startup time is longer than JVM startup time, that means you are
> doing an even better job than you thought :-). This wouldn't surprise me
> as the Java String class is not built in, it's JIT'd, whereas in Ruby the
> string support is builtin.
> 
I don't know, the JVM JIT really punishes command line tools to death,
and it's such a pain to turn it off.  JIT rocks for long running
processes, but a test like this is probably being seriously punished.
That's why I was a bit sheepish about the 10 times faster claim without
specifically saying that I wanted to include start-up time since I'm
writing a CLI tool.

Thanks for the comments.

Zed