```
Brian Schröder wrote:
> I tried to implement an algorithm that is perfect regarding the chunk size given the percentage information. This is described in
>
> http://ruby.brian-schroeder.de/quiz/detect_words/content/content.html
> or prettier
> http://ruby.brian-schroeder.de/quiz/detect_words/content.ps
>
> The problem is that this algorithm was not significantly better than always dividing into three parts. (Surprisingly, it was even slower.) Maybe my math is wrong there; if you spot the bug in my reasoning, it would be very interesting.

I believe that you oversimplified the problem, and therefore your math
is wrong (or rather, too coarse an approximation, as your results show).

N ... number of words
n ... number of banned words
r := P[word w is banned] = n/N
p := P[slice of size x passes] = (1-r)^x

Your formula for p ((1-r)^x) assumes that the probability r is constant,
while in reality it changes with every word you put into your slice.

The probability of a slice of size x passing is the same as the
probability that none of the x words you randomly pick is banned.
Now, while you pick (you only pick non-banned words, because you
immediately stop if you get a banned one), the ratio banned/total
for the _remaining_ set constantly changes.

You can instantly see that your formula cannot be exact if you
try to calculate the probability of a slice of size N passing.
Your formula gives (1-n/N)^N, which is quite small, but still
greater than zero. The exact probability is obviously zero (assuming
n > 0), because a slice containing all N words necessarily contains
every banned word.
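You can check this numerically: the exact passing probability when drawing x words from N without replacement is hypergeometric, a product of changing ratios rather than a fixed (1-r)^x. A small sketch (method names made up for illustration):

  # Exact probability that a slice of x words, drawn from num words
  # of which ban are banned, contains no banned word (hypergeometric:
  # the ratio shrinks with every non-banned word already drawn).
  def exact_pass(num, ban, x)
    (0...x).inject(1.0) { |p, i| p * (num - ban - i) / (num - i) }
  end

  # The approximation (1 - ban/num)^x from the original posting.
  def approx_pass(num, ban, x)
    (1.0 - ban.to_f / num) ** x
  end

  puts exact_pass(100, 5, 50)    # noticeably smaller than the approximation
  puts approx_pass(100, 5, 50)
  puts exact_pass(100, 5, 100)   # exactly 0.0
  puts approx_pass(100, 5, 100)  # still positive

Note the full-slice case: at i = num - ban the factor becomes zero, so a slice of all N words can never pass, in agreement with the argument above.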

Additionally, the probability r changes every time you find a non-banned
slice. If, for example, you test 50 out of 100 words and the slice
passes, the being-banned probability of the remaining words doubles.
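A quick arithmetic check of that doubling (numbers chosen purely for illustration):

  num, ban = 100, 5
  r_before = ban.to_f / num          # 0.05
  # a slice of 50 words passes: all 5 banned words must be among the other 50
  r_after  = ban.to_f / (num - 50)   # 0.1 -- exactly double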

greetings, Florian Pflug

```