2011/6/23 Robert Klemme <shortcutter / googlemail.com>:
> I'd also not use recursion in srv_entries_randomize() - a loop is
> usually more efficient.

I was not able to do it with a loop, not even when working the
algorithm out on paper; I always got bad statistical results. Are you
sure that your code generates statistically correct results? For
example, with my sample data:

priorities = {
  1 => [[0, :"server-1"]],
  2 => [[16, :"server-2-A"], [4, :"server-2-B"], [8, :"server-2-C"]],
  4 => [[50, :"server-3"]]
}

I get these correct results (columns are positions 1 to 5):

Iterating 50000 times...

Results:
------------------------------------------------
server-1:     50000      0      0      0      0
server-2-A:       0  28488  16302   5210      0
server-2-B:       0   7226  12245  30529      0
server-2-C:       0  14286  21453  14261      0
server-3:         0      0      0      0  50000
------------------------------------------------



> And btw. you calculate the total weight every
> time the method is invoked as sum of all entries while I maintain the
> @total and adjust it only for every insertion and removal.

Right. I'm trying to improve that. However, keep in mind that my code
does not create an instance. Instead it will be a class method (or a
module method like DNS::srv_randomize), so I cannot (or at least
should not) use instance attributes.
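
For illustration, a minimal sketch of how a module method could still
keep a running total without instance attributes, by holding it in a
local variable and using a loop instead of recursion. The signature
(an array of [weight, target] pairs for one priority, plus an optional
Random) is an assumption made for this example, and the sketch assumes
every weight is greater than 0:

```ruby
module DNS
  # Hypothetical signature: entries is an array of [weight, target]
  # pairs sharing one priority. Assumes every weight is > 0.
  def self.srv_randomize(entries, rng = Random.new)
    pool    = entries.dup
    total   = pool.sum { |weight, _target| weight } # summed only once
    ordered = []

    until pool.empty?
      pick    = rng.rand(total) # integer in 0...total
      running = 0
      # First entry whose cumulative weight exceeds the picked value.
      index = pool.index { |weight, _target| (running += weight) > pick }
      weight, target = pool.delete_at(index)
      total -= weight # adjust the total instead of summing again
      ordered << target
    end

    ordered
  end
end

group = [[16, :"server-2-A"], [4, :"server-2-B"], [8, :"server-2-C"]]
p DNS.srv_randomize(group)
```

The total lives only for the duration of one call, so nothing is
shared between invocations.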

But you are right, I should remove the recursion. If you can show me
that your code gets the same results for 10000 iterations with the
same input data, I will adapt my code :)
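
Such a comparison could be done with a small tally helper like the
sketch below. The name position_counts is hypothetical, and the block
passed to it here is only a stand-in (a plain uniform shuffle), not
either of the implementations under discussion:

```ruby
# Count how often each target lands in each position (1-based).
# The block must return one complete ordering as an array of targets.
def position_counts(iterations)
  counts = Hash.new { |hash, target| hash[target] = Hash.new(0) }
  iterations.times do
    ordering = yield
    ordering.each_with_index do |target, index|
      counts[target][index + 1] += 1
    end
  end
  counts
end

# Stand-in ordering for demonstration only: a uniform shuffle.
targets = [:"server-2-A", :"server-2-B", :"server-2-C"]
counts  = position_counts(10_000) { targets.shuffle }

counts.sort.each do |target, by_position|
  row = (1..targets.size).map { |pos| by_position[pos].to_s.rjust(7) }.join
  puts "#{target.to_s.ljust(12)}#{row}"
end
```

Running both implementations through the same helper with the same
input would give two tables that can be compared column by column.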



> I notice you have a require 'benchmark' in there but I don't see any
> Benchmark methods used...

Initially I used it. Later I switched to just doing
"start_time = Time.now" and so on.


> You also seem to have the habit of placing
> assignments in method argument lists or control flow statements. This
> makes code harder to read and is really only needed in case of loops,
> e.g.
>
> while (str = io.gets)
> printf "We have read: %p\n", str
> end

Right, in fact I did it for performance reasons, to avoid accessing
the same element of a hash twice, but I've realized that, depending
on the case, it's actually more efficient to access it twice than to
create a new variable.




> There is one thing I don't understand in your code: you have two
> randomizations in there: in line 8 there is rand() similar to what I
> have done and in line 34 there is shuffle. Why do you do that? Is
> there a requirement that hasn't been mentioned yet?

You are right, sorry. If a SRV record has weight 0, it should have no
chance of being chosen first (before other records with the same
priority and a weight greater than 0). So what I do is remove the SRV
records with weight 0 from each priority group, do a simple shuffle on
them, and append the result at the end of that group. For example:

  - priority 1, weight 10, domain "server1", port 5060
  - priority 1, weight 0, domain "server2", port 5060
  - priority 1, weight 0, domain "server3", port 5060

In this case, records 2 and 3 should always be chosen after record 1
(which has weight > 0). The order of records with weight 0 must be
random.
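
A minimal sketch of that step, assuming the records of one priority
are given as [weight, target] pairs. The method name zero_weights_last
is hypothetical, and the block stands in for whatever weighted
ordering is used for the records with weight > 0:

```ruby
# entries: [weight, target] pairs that share one priority.
# The block receives the entries with weight > 0 and must return
# their targets in the desired (weighted random) order.
def zero_weights_last(entries, rng = Random.new)
  zero, positive = entries.partition { |weight, _target| weight.zero? }
  weighted = positive.empty? ? [] : yield(positive)
  # Zero-weight records always come last, in uniformly random order.
  weighted + zero.map { |_weight, target| target }.shuffle(random: rng)
end

records = [[10, "server1"], [0, "server2"], [0, "server3"]]
# Placeholder block: keeps the input order instead of a weighted draw.
order = zero_weights_last(records) do |positive|
  positive.map { |_weight, target| target }
end
p order
```

With this input "server1" is always first, while "server2" and
"server3" follow in either order.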



Thanks a lot for your interest. It's very helpful.





-- 
Iñaki Baz Castillo
<ibc / aliax.net>