Philip Rhoades wrote:
> People,
>
> The "how can we make a ruby compiler" thread has been very interesting 
> - I really like hearing from serious and competent programmers about 
> the theoretical problems involved with this issue.
>
> William Rutiser asked for an expansion on the details of my C/C++ 
> population genetics simulation program as a specific example of how 
> one might proceed depending on a particular situation.  I am happy to 
> elaborate - not least because it would be good to get input from 
> experienced Ruby programmers before I just try to replicate the same 
> program in Ruby - I'm sure there will be more sensible/efficient ways 
> of doing things than what I would attempt first off and so comparisons 
> between my C/C++ version and a dodgy Ruby program might be even more 
> unfair . .
>
> I will summarise the C/C++ program as it exists now (it has gone 
> through a number of versions and has a lot of code that does not need 
> replicating for the present comparison) with a general overview (I can 
> add more detail later if people are interested):
>
> - A population is represented by a number of sub-populations which 
> occupy cells of a two dimensional array
>
> - each cell has pointers to three ordered lists - representing the 
> parental population, the offspring population and a temporary 
> population and each element of a list represents an individual
>
> - at each generation (of potentially hundreds), the whole array is 
> iterated through, new offspring are produced, migrants move to 
> adjacent cells, parents die off etc
>
> As well as this main simulation program, I have already replaced all 
> the original shell scripts and some of the statistical processing with 
> Ruby scripts but the main simulation program is where I couldn't 
> afford an order of magnitude or two increase in running times by 
> rewriting in Ruby.
>
> The main problem I see with some of the Ruby conversions that I have 
> looked at (eg RubyInline) is that the performance problem comes from 
> repeating the WHOLE simulation with different starting parameters 
> which is done thousands of times - so it is not like you have a single 
> recursive algorithm which is a bottleneck that can be optimised or 
> rewritten in C or something.  There are lots of little steps that 
> happen millions of times . .
>
> I hope that is sort of clear?
>
> Regards,
>
> Phil.
Some questions with my guesses at possible answers:

How much data is carried for each individual? a few alleles? a lot? the 
whole human genome?

What is done in the innermost loop? I imagine you select a pair of 
individuals, cross them, etc., to produce more individuals. Or perhaps an 
individual dies or migrates to another population.
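
To make that guess concrete, here is a rough Ruby sketch of the kind of 
generation step I am imagining. It is only my assumption of how the sim 
might look, not a description of your program: the names (Individual, 
Cell, cross, neighbours, step), the single two-allele locus, the 
four-neighbour migration and the migration_rate parameter are all 
invented for illustration.

```ruby
# Hypothetical sketch only: every name and rule below is an assumption.
Individual = Struct.new(:alleles) # two alleles at a single locus

# One grid cell holding the three lists from the overview.
class Cell
  attr_accessor :parents, :offspring, :temp

  def initialize(parents = [])
    @parents = parents
    @offspring = []
    @temp = [] # migrants arriving from adjacent cells
  end
end

# Cross two parents: the child gets one randomly chosen allele from each.
def cross(mum, dad, rng)
  Individual.new([mum.alleles.sample(random: rng),
                  dad.alleles.sample(random: rng)])
end

# Coordinates of the up/down/left/right cells that lie inside the grid.
def neighbours(grid, row, col)
  [[-1, 0], [1, 0], [0, -1], [0, 1]]
    .map { |dr, dc| [row + dr, col + dc] }
    .select { |r, c| r.between?(0, grid.size - 1) && c.between?(0, grid[0].size - 1) }
end

# One generation over the whole grid: breed, migrate, replace parents.
def step(grid, rng, migration_rate: 0.1)
  # 1. Each cell with at least two parents produces as many offspring
  #    as it has parents, each from a randomly chosen pair.
  grid.flatten.each do |cell|
    next if cell.parents.size < 2
    cell.parents.size.times do
      mum, dad = cell.parents.sample(2, random: rng)
      cell.offspring << cross(mum, dad, rng)
    end
  end

  # 2. Some offspring move to a random adjacent cell's temporary list.
  grid.each_with_index do |row, r|
    row.each_with_index do |cell, c|
      nbrs = neighbours(grid, r, c)
      next if nbrs.empty? # a 1x1 grid has nowhere to migrate to
      stay, go = cell.offspring.partition { rng.rand >= migration_rate }
      cell.offspring = stay
      go.each do |ind|
        nr, nc = nbrs.sample(random: rng)
        grid[nr][nc].temp << ind
      end
    end
  end

  # 3. Parents die off; residents plus migrants become the new parents.
  grid.flatten.each do |cell|
    cell.parents = cell.offspring + cell.temp
    cell.offspring = []
    cell.temp = []
  end
end
```

If the real inner loop looks anything like this, then almost all of the 
time goes into many small object allocations and array operations, which 
would fit your point that there is no single hot spot to hand off to C.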

What do you do to run a new experiment? generate the initial population, 
specify the rules for combination, death, migration, etc? gather data 
and process it at the end?

How much code do you need to write or rewrite for an experiment? How 
many parameters are involved?

Can you give some examples of aspects of the C++ program that you hope 
to improve by moving the sim to Ruby?

Can you say something about the context of your modeling? How is your 
simulator different from others used in the field?

-- Bill