Sir Hal,

I may be late by the time this gets posted. If so, I apologize...

(BTW: Thanks for The Ruby Way too... a great book).

My thoughts on this were that a frequency integer represents an interval
/ range into an expanded set of the given elements repeated given
element frequency times.

Eg. Given a composed array of values and weights

set = [ ["A",2], ["B",1] , ["C",6] ]

# C is 6 times more likely than B
# C is 3 times as likely as A
 
# The full population frequency array would be

expanded_set = set.collect{|e| [ e[0] ] * e[1] }.flatten 
# [ "A", "A", "B", "C", "C", "C", "C", "C", "C" ]

weighted_random_element = expanded_set[ rand( expanded_set.length() ) ]
# "C"

{I checked this by histogram over 10000 trials and the frequencies were
converging with expected values}

But it might be pretty expensive (in space, at least) for large elements
and weights, especially if the weights can't be scaled by their greatest
common factor.  

Eg. For a random number between 0 and the sum of all weights, count
along the sets until the number minus the sum of all processed element
weights is zero. Return the set element on which this occurs.

But without expansion (at least of the indexes into the original array),
how do you map backwards from the frequency? Any ideas?

Hope this helps,

Lachlan Pitts


-----Original Message-----
From: Hal E. Fulton [mailto:hal9000 / hypermetrics.com] 
Sent: Saturday, 29 March 2003 4:12 PM
To: ruby-talk ML
Subject: Weighted random selection -- how would you do this?


Here's a little question for you.

Given: A set of items, each with
an associated integer representing
the relative likelihood that each
item will be selected.

Write: A method that will select
a random item.

I've started this a couple of times,
but have become convinced I'm making
it too hard, and someone can probably
come up with a three-liner.

Any ideas?

Cheers,
Hal