On Mon, Jul 18, 2005 at 02:48:34AM +0900, Bill Kelly wrote: > My method of "finding the next slot" in a loop on a collision > may be overly cheesy - I don't know. I did consider just > asking for a new sample until finding one that didn't collide > (I like Joost's "dumping the values in and checking the > length" approach.) However I've tended to avoid that approach, > since when the number of samples desired approaches the > upper_bound, e.g. 5_000_000 5_000_000 worst case, I've been > burned in the past by such algorithms working very hard to find > the last few empty slots. However, it clearly wasn't a problem > for the 5_000_000 1_000_000_000 parameters for this quiz - I > just wanted to explain why I did it differently. I thought about that too, but I chose to ignore the problem because: a) I assumed that if you want random samples from a set, you generally want much less samples than the size of the total set. b) As soon as samples.size > total.size / 2 it's trivial to turn the algorithm on its head (i.e. randomly select items that you'll skip, then loop through the set and print all items that aren't selected). I did a few runs that suggest that sample.size == total.size / 2 is relatively slow on my implementation, but not impractal. Your milage may vary if you use a crappy random number generator, ofcourse :-) Joost.