Stephen Kellett wrote:
> In message <1116294726.4747.30.camel / localhost.localdomain>, Zed A. Shaw
> <zedshaw / zedshaw.com> writes
>
>> The first thing is that there's not statistical basis for "1000 times".
>
> There is. The error is smaller. If you don't believe me you need to
> examine why pollsters always ask at least 1000 potential voters their
> opinion. The error rate is +/- 3% with a sample size of approx 1000
> voters. Ask 10 people and predict the election result and your error
> will be much greater than 3%. The pollsters are in it to make money
> predicting outcomes. If they could get away with 5 or 10 samples, they
> would. It would be more profitable. They don't do it that way.

True (mostly), but irrelevant. Those statistics apply to problems of
estimating proportions, but this isn't one. Characterizing the performance
of systems like this can be expressed as a simple linear regression problem:

    t = a + bx + e

where

    t = runtime
    a = fixed overhead (startup, teardown, etc.)
    b = runtime per 'size' unit
    x = size of request or returned data
    e = random error

Choose N values of x and observe their corresponding t values. Estimate
a and b using standard regression techniques.

The "goodness" (i.e., the variance) of the estimates of a and b depends on
the variance of e and the value of N. If var(e) is small, you can get good
estimates of a and b with small N. In particular, if var(e) = 0, you can
get perfect estimates of a and b with N = 2 (two distinct values of x).

If I needed 1000 samples to get good estimates of the performance of an
information system, I'd stop trying to overcome that with large numbers and
figure out why randomness plays such a large role in the performance of my
system.

Steve
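
For anyone who wants to try the regression model t = a + bx + e described above, here is a minimal sketch in Python. Everything concrete in it is an assumption made up for illustration: the "true" values TRUE_A and TRUE_B, the request sizes, the noise level, and the helper name fit_line. It uses plain ordinary least squares and is not a measurement of any real system.

```python
import random


def fit_line(xs, ts):
    """Ordinary least squares fit of t = a + b*x. Returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_t = sum(ts) / n
    # Sum of squared deviations of x, and cross-deviations of x and t.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxt = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, ts))
    b = sxt / sxx              # estimated runtime per size unit
    a = mean_t - b * mean_x    # estimated fixed overhead
    return a, b


# Hypothetical "true" system: 5.0 units of fixed overhead, 0.25 per size unit.
TRUE_A, TRUE_B = 5.0, 0.25

# Case 1: var(e) = 0. Two distinct sizes are enough to recover a and b.
xs_exact = [100.0, 1000.0]
ts_exact = [TRUE_A + TRUE_B * x for x in xs_exact]
a_exact, b_exact = fit_line(xs_exact, ts_exact)

# Case 2: small random error (assumed Gaussian, sigma = 0.1), N = 10 sizes.
rng = random.Random(42)  # fixed seed so the run is repeatable
xs_noisy = [100.0 * i for i in range(1, 11)]
ts_noisy = [TRUE_A + TRUE_B * x + rng.gauss(0.0, 0.1) for x in xs_noisy]
a_noisy, b_noisy = fit_line(xs_noisy, ts_noisy)

print("noise-free, N=2 :", a_exact, b_exact)
print("small noise, N=10:", a_noisy, b_noisy)
```

Raising the sigma passed to rng.gauss and rerunning shows how the spread of the estimates grows with var(e), which is the quantity that determines how large N has to be.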