In message <4635948C.5050803 / blackkettle.org>, Alex Young writes:
>Peter Seebach wrote:
>> In message <46358854.30705 / cesmail.net>, "M. Edward (Ed) Borasky" writes:
>>> Welll ... I do this in Excel all the time -- add a column of random
>>> numbers to the right of the (CSV) file I want to shuffle, sort it about
>>> five times, hitting F9 before each sort to change the random numbers,
>>> and then deleting the random column and saving the file. Translate that
>>> to Ruby and I think you've got it.

>> It seems expensive.

>That's almost exactly what sort_by{rand} does...

Right.

And so far as I can tell, sort_by{rand} is inefficient compared to
a canonical shuffle algorithm.  That'd make sense; it's going to do a lot
of compares, and some fair number of them (I'd guess half, or close to
it?) result in swaps.  A shuffle does N swaps, or 2N assignments.  A
sort might do more.  Sorting a list of 10,000 items, sorted by rand,
performed about 128,000 comparisons.  If half of them result in swaps,
that's 64k swaps.

I can't think of any way in which sort_by{rand} is going to be "more random"
than a correct shuffle algorithm (yes, I know the one I posted the first time
is not that correct algorithm; I didn't do swaps, just assignment).

I also can't see an immediate reason to do the random numbers sort more
than once; I guess to reduce the chances of collisions preserving order.

-s