In message <42B06AD0.3080904 / ieee.org>, Steven Jenkins 
<steven.jenkins / ieee.org> writes
>It's been a long time since I was involved in one, but I'm reasonably 
>confident that we use "standard" benchmarks for large procurements.

I think some people have lost sight of what "benchmark" means. For 
computer apps some people have been claiming it's TPS, MIPS or whatever 
form of throughput they are proposing. However, take a step back and 
think about "benchmark" in more general terms and you get a better idea 
of what a benchmark is. This is what Steven Jenkins was identifying 
with his satellite TCP/IP benchmark.

A benchmark is something, anything by which you can compare. Typically 
it is the best of breed at some point or other. Here is an example:

I play various musical instruments, one of them being the Border Bagpipe 
made by Jon Swayne. Jon Swayne is a legend in his own lifetime to many 
dancers and many musicians in the UK. For dancers it is because he is 
part of Blowzabella, a major musical force in social dancing throughout 
the last 25 years. For musicians, and particularly bagpipers, it is 
because he took the bagpipe, an instrument known for typically not being 
in tune (and, if it was, not necessarily in tune with another bagpipe of 
the same type, or even by the same maker!), and created a new standard, 
a new benchmark if you will, by which other bagpipes are judged. It's 
not just Jon Swayne, there are some other makers too, but between them 
they changed everyone's perception, and his pipes are the benchmark by 
which others are judged (yes, they really are that good). When you talk 
to pipers in the UK and mention his name there is a respect that is 
accorded. You don't get that without good reason. Anyway, I digress.

The benchmark for Steven's satellite test was whether it matched the 
round-trip criteria. I think Steven's example absolutely is a benchmark. 
It's much looser than other benchmarks, but that's not the point. The 
point is: did it serve a purpose?

For some people the benchmark will be: does it perform the test within a 
given tolerance? For others it may be: how much disk space does it use, 
or is the latency between packets between X and Y? For others still it 
will be: is it faster than X?

Where Austin's point comes in is that he points out the latter test is 
meaningless because you are comparing apples with oranges, when you 
should really be comparing GMO-engineered (optimized) apples with 
GMO-engineered (optimized) oranges to get even close to a meaningful 
test. Even then you are still comparing cores to segments, and it gets a 
bit messy after that, although they both have pips.

Even so, I once worked for a GIS company (A) that wrote its software 
in C with an in-house scripting language. We won the benchmarks when in 
competition with other GIS companies; the competition won the business 
because of clever marketing. Their customers lost (*) though, because the 
competitors' software was too hard to configure and our marketing people 
were not smart enough to identify this and inform the customer of the 
problem.

What sort of benchmarks were being tested?
o Time to compute catchment area of potential customer base within X 
minutes drive given a drive time to location.
o Time to compute catchment area of potential customer base within X 
minutes drive given a drive time from location.
o Time to compute drive time to location of potential customer base 
within X minutes drive given a particular post code area.
o Time to compute drive time from location of potential customer base 
within X minutes drive given a particular post code area.
o Think up any other bizarre thing you want.
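
(For anyone curious what computing a drive-time catchment might look 
like, here is a minimal sketch. This is not our actual code, which was C 
plus the in-house scripting language; the function name and graph format 
are my own invention for illustration. It is just Dijkstra over a 
directed road graph whose edge weights are drive times in minutes, 
keeping every node reachable within the limit.)

    import heapq

    def drive_time_catchment(graph, origin, limit_minutes):
        """Return {node: minutes} for every node reachable from origin
        within limit_minutes. graph[node] = [(neighbour, minutes), ...]."""
        best = {origin: 0.0}
        heap = [(0.0, origin)]
        while heap:
            t, node = heapq.heappop(heap)
            if t > best.get(node, float("inf")):
                continue                      # stale heap entry
            for neighbour, minutes in graph.get(node, []):
                nt = t + minutes
                if nt <= limit_minutes and nt < best.get(neighbour, float("inf")):
                    best[neighbour] = nt
                    heapq.heappush(heap, (nt, neighbour))
        return best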

Drive times to and from a location may not be the same because of 
highway on/off ramps, traffic-light network delay bias and one-way 
systems. Superstores often don't care much about drive time from, but 
care a lot about drive time to. For example drive time from may be 
15 mins, but drive time to may be only 5 mins.
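
(That asymmetry falls straight out of the directed graph: drive time 
from the store is a search over the road network as drawn, drive time to 
the store is the same search over the network with every edge reversed, 
and one one-way street is enough to make the answers differ. A toy 
example using the sketch above, with an invented road network:)

    # One-way road straight into the store; leaving forces a detour.
    roads = {
        "A":     [("store", 5)],   # A -> store: 5 min
        "store": [("B", 7)],       # store -> B: 7 min
        "B":     [("A", 8)],       # B -> A: 8 min
    }

    def reverse(graph):
        flipped = {}
        for node, edges in graph.items():
            for neighbour, minutes in edges:
                flipped.setdefault(neighbour, []).append((node, minutes))
        return flipped

    print(drive_time_catchment(roads, "store", 30))
    # drive time FROM the store: {'store': 0.0, 'B': 7.0, 'A': 15.0}
    print(drive_time_catchment(reverse(roads), "store", 30))
    # drive time TO the store:   {'store': 0.0, 'A': 5.0, 'B': 13.0}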

As you can see, the customer requirements are highly subjective, but the 
raw input data is hard data: maps and fixed road networks. The 
computing time etc. is also a fixed reality given the hardware.

It's all about perception and need.

I think the term "benchmarketing" is quite apt for most benchmarks.

....and Steven, your story was great. I could really relate to a lot of 
that.

Stephen

(*) It's a matter of debate: they also used an in-house language, and 
finding engineers outside the competitor who knew the language was nigh 
on impossible, so they were very expensive to hire to do the 
configuration. Our (A) stuff was not so configurable, but it didn't need 
to be.

When were we doing this stuff? 90..94 for me. X11 and Motif were the cool 
stuff back then.
-- 
Stephen Kellett
Object Media Limited    http://www.objmedia.demon.co.uk/software.html
Computer Consultancy, Software Development
Windows C++, Java, Assembler, Performance Analysis, Troubleshooting