* Mike Campbell (michael_s_campbell / yahoo.com) wrote:

> > There is also mathematically nice recipe for picking random line
> > from the file. In Perl at

> > http://www.oreilly.com/catalog/cookbook/chapter/ch08.html#chap08_picking_0
> > and in Python at
> > http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/59865

> > It is ecological, by not reading entire file into the array of
> > strings. It is easy to write the same in Ruby.
>
> It is a nifty algorithm and works reasonably well, depending on how
> fast your rand() is.  I mention this only as a pedantic nit because it
> calculates a random number *on every line* of the file, which may or
> may not be fast enough for you.
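
For anyone who hasn't followed the links, here is a rough Ruby sketch of
that one-pass recipe. The method name pick_random_line is made up for
illustration and is not from either cookbook. Line n replaces the current
pick with probability 1/n, so every line ends up equally likely, at the
cost of one readline and one rand() per line:

```ruby
# Hypothetical helper: pick one line uniformly at random in a single pass.
# Returns nil for an empty file.
def pick_random_line(path)
  chosen = nil
  # n counts lines starting from 1
  File.foreach(path).with_index(1) do |line, n|
    # rand(n) is an integer in 0...n, so this is true with probability 1/n
    chosen = line if rand(n).zero?
  end
  chosen
end
```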

The way I do this is something like:

tag = nil  # declared out here so it is still visible after the block
File.open(aFile, 'r') do |f|
        tries = 0
        begin
                # seek to a random point in the file
                f.seek(rand(File.size(aFile)))

                # read to the next tagline
                # (aSeperator must include the trailing "\n")
                begin
                        line = f.readline
                end until line == aSeperator

                tag = ''
                while (line = f.readline) != aSeperator
                        tag += line  # readline keeps the trailing "\n"
                end
        rescue EOFError
                # we hit EOF before getting a tagline.
                # either retry, or set tag = some default tagline
                tries += 1
                if tries < 3
                        retry
                else
                        tag = "Something funny"
                end
        end
end

It's much cheaper: one random number, one seek, one stat, and roughly
(average_tagline_length * 1.5) + 1 readline()s.  Almost certainly the
better approach on a 300,000 line tagline file than doing on average
150k readline()s and rand()s :)

-- 
Thomas 'Freaky' Hurst  -  freaky / aagh.net  -  http://www.aagh.net/
-
If computers take over (which seems to be their natural tendency), it will
serve us right.
		-- Alistair Cooke