--Apple-Mail-1-926908554
Content-Transfer-Encoding: 7bit
Content-Type: text/plain;
	charset=US-ASCII;
	delsp=yes;
	format=flowed

Hi all-

Here is my first attempt at a Ruby Quiz  (attached file:   
'markovtext.rb').  I'm new to Ruby, mostly paying the bills over the  
past years with Perl work.  This quiz was fun, and I learned a lot  
from doing this.  Coming from Perl, I often rely upon auto- 
vivification, so I needed to figure out how to work around this.   
Perhaps there are improved ways of going about this, and I'd  
appreciate any feedback on how to go about it better.

My solution is fairly straight forward, using a second order word  
chain, which I imagine many others will be implementing in similar  
ways.  As noted in the comments, I only start making my paragraph on  
the second sentence, in order to get greater diversity in how the  
paragraph starts.  I'm also striping out some of the punctuation from  
the end result, as some texts give orphaned punctuation marks.

I had the most fun from the quiz running my script against various  
texts, and seeing the results.  My favorite has been getting results  
from "Huckleberry Finn", though the grammar gets a bit off at times,  
but that relates to the input text.  For comparison, one can see much  
more typical sentence structure from "Journey to the Center of the  
Earth".

Example results:

==

Huck Finn via 'markovtext.rb':

And above all, don't you try to run in the dark he says: Yes, Mars  
Sid, A dog. Cur'us dog, too. Does you want it, you can go and write  
poetry after them out loud. One bill said, The celebrated Dr. Armand  
de Montalban, of Paris, would lecture on the floor and tied up the  
bank. I couldn't help it, and she snatched a kiss of him, and never  
think about it; and I went along up the smoothness and the judge for  
him, being a mystery, and he'd cipher out a speech, all full of it,  
and I would kill me, dey sk'yers me so.

==

Journey to the Center of the Earth via 'markovtext.rb':

I gave way to the south-east. We have passed through the treatises of  
Cassanion, and all his time at Altona, staying with a translation?  
This IS the Icelandic Professor. At this rate we shall see. So says  
the Professor, no doubt, was pursuing his observations or taking  
notes, for in three days we must go back to the dull thuds of the  
country might be held up by the descent commenced. I can but try  
Spanish, French, Italian, Greek, or Hebrew.

==

Cheers,
-albert



--Apple-Mail-1-926908554
Content-Transfer-Encoding: 7bit
Content-Type: text/x-ruby-script;
	x-unix-mode=0644;
	name="markovtext.rb"
Content-Disposition: attachment;
	filename=markovtext.rb

class String
  def wordwrap(len)
    # method for wrapping text, used for the output
    gsub( /\n/, "\n\n" ).gsub( /(.{1,#{len}})(\s+|$)/, "\\1\n" )
  end
end

class Markov

  def initialize(file)
    # open file, and make hash for storing the text
    @text = Hash.new
    read_text(file)
  end

  def create_paragraph(len)
    # method to generate a paragraph, once text has been read in
    # takes a length, which is the minimum number of words for the 
    # paragraph
    # array to hold the proximal words as scanning along
    scan = Array.new(2,"\n")
    @words = Array.new 
    # keep reading until length has been exceed, and end with a 
    # closing punctuation mark of '.' or '?' or '!'
    flag = 0
    while @words.length <= len || flag == 0
      flag = 0
      # select a random word which is preceeded by the previous two
      # words
      word = random_word(scan)
      # exit if hit end of previous text.
      break if word == "\n"
      flag = 1 if word =~ /[\.\?\!][\"\(\)]?$/;
      # only start slurping words once finish first sentence
      # otherwise the start of the text will always be the same.
      @words.push(word) if @words.length > 0 || flag == 1
      # shift the array to contain the previous two words for the next 
      # round
      scan.push(word).shift
    end
    # remove the first word.  a left over from the opening sentence
    @words.shift
  end
  
  def print_text
    # method to output the paragraph created.  '"' and '(' and ')'
    # are removed, as they are often orphaned in the output text.
    # no attempt is made to quote spoken text in this version.
    print @words.join(" ").gsub(/[\"\(\)]/,"").wordwrap(68)
  end
  
  private
  
  def read_text(file)
    # read the file
    File.open(file) do |f|
      # array to hold the preceeding two words while reading the text
      scan = Array.new(2,"\n")
      while line = f.gets
        line.split.each do |w|
          # call the method which adds the next word to the text hash
          add_text(scan[0], scan[1], w)
        # shift the array to contain the previous two words.
        scan.push(w).shift
        end
      end
      # add a return at the end to mark the end of the text.
      add_text(scan[0],scan[1], "\n")
    end
  end
  
  def random_word(scan)
    # select random word which is preceed by the previous 
    # two words
    index = rand(@text[scan[0]][scan[1]].length)
    return @text[scan[0]][scan[1]][index]
  end
  
  def add_text(a,b,c)
    # method which builds the text hash (a second order word chain)
    # first check whether key exists for first word
    # if so, then check whether keys exists for second word
    # build hash appropriately.  after hash-hash, then build
    # array which contains all words which are proceeded by the
    # previous two.
    if @text.key?(a)
      if @text[a].key?(b)
        @text[a][b].push(c)
      else
        @text[a][b] = Array.new(1,c)
      end
    else
      @text[a] = Hash.new
      @text[a][b] = Array.new(1,c)
    end
  end
  
end

# call script, telling file to process, and the minimum
# length of text to output.

if ARGV[1] == nil
   abort("Usage: markovtext.rb file length")
else
   file = ARGV[0]
   length = ARGV[1].to_i
end

text = Markov.new(file)
text.create_paragraph(length)
text.print_text
--Apple-Mail-1-926908554--