On Tue, 6 Dec 2005, Mauricio [iso-8859-1] FernáÏdez wrote:

> On Sun, Dec 04, 2005 at 10:21:02AM +0900, ara.t.howard / noaa.gov wrote:
>> On Sun, 4 Dec 2005, Mauricio [iso-8859-1] FernáÏdez wrote:
>>> sort_by{ rand } is actually biased, since sort_by will preserve the
>>> relative order of the elements for which rand() returned the same value.
>
> I did some numbers and found the following probability estimates:
>
> array size  P(#rand() collision)
> 1000       5.54558e-11
> 10000      5.55056e-09
> 1000000    5.55096e-05
> 10000000   5.53574e-03
>
> A collision implies that the associated pair of elements has got the same
> ordering in the output array, instead of 50-50 chances of it being reversed.
>
> More info, including the Ruby code I used, available at
> http://eigenclass.org/hiki.rb?sort_by+rand+is+biased
>
>> does
>>
>>   %w( a b c d ).sort{ rand <=> rand }
>>
>> help?
>
> I'm not totally sure it's really unbiased, but at first sight it looks good.

the big difference is that a value will have rand called multiple times for it
- in otherwords an element compareds differently every time.  this is bound to
help shuffle more uniformly it seems:

   harp:~ > cat a.rb
   %w( a b c d ).sort{|a,b| ra = rand; rb = rand; p a => ra; p b => rb; ra <=> rb }

   harp:~ > ruby a.rb
   {"a"=>0.181439380218727}
   {"c"=>0.579403886629126}
   {"c"=>0.713513989939958}
   {"d"=>0.573687508540169}
   {"a"=>0.851534740906118}
   {"d"=>0.00156074151427565}
   {"b"=>0.803591007691647}
   {"a"=>0.494066110872563}
   {"b"=>0.834676789626288}
   {"c"=>0.792106134460754}

so, note that "a" was sorted using a different value each time.

can anyone who remembers stats and knows the sort algorithm weigh in?

-a
-- 
===============================================================================
| ara [dot] t [dot] howard [at] noaa [dot] gov
| all happiness comes from the desire for others to be happy.  all misery
| comes from the desire for oneself to be happy.
| -- bodhicaryavatara
===============================================================================