My hack at the substring problem (nice one, James) is based on suffix  
trees.

Suffix trees and suffix arrays and a great tool for substring  
operations. The idea is to create a list of all the possible suffixes  
in the original string. For "banana" this would be

banana
anana
nana
ana
na
a

Because the list contains an entry starting at every position in the  
string, we know that all possible substrings of the original string  
must occur at the start of one of these list elements. The substring  
"nan", for example, occurs at the start of the 3rd entry.

Now sort them

a
ana
anana
banana
na
nana

Now we know that all substrings that start with the same sequence of  
characters must occur at the start of items in the list that are  
adjacent. Searching through to list to find these is a linear operation.

A suffix array is basically this data structure. For efficiency,  
though, it doesn't store the actual strings. Instead it stores the  
offset of the start of the string (for our example, this would be [6,  
4, 2, 1, 5, 3]). You'll find a whole bunch of stuff on using suffix  
arrays for searching and string matching, particularly in the area of  
DNA sequencing.

It turns out that in Ruby, you don't always have to go to the trouble  
of constructing the suffix array. When you take a substring in Ruby,  
it doesn't copy the string data. Instead, it just constructs a new  
string object (just a few words of memory) that references into the  
original string. Only when either string is modified does the string  
content get copied. This means the code for constructing the suffix  
list is both simple and relatively efficient:

   def initialize(text)
     @suffix_list = []
     len = text.length
     len.times { |i| @suffix_list << text[i, len] }  # the ,len] is a  
hack...
     @suffix_list.sort!
   end

The only ugly part is the use of [i, len] to create the substring.  
Technically it should be 1..len, but that constructs a new,  
unnecessary range object. Using 'len' as the final index does the same  
thing, because the substring stops at the end of the input string, but  
it can be misleading to read.

The code to scan the suffix list looks at each adjacent pair, and sees  
if it can find a longer matching substring at the start of each that  
it has previously found. Because James disallowed overlapping  
substrings, you have to be careful to look at no more that 1/2 of the  
longest string. The basic loop looks like this:


     s1 = @suffix_list[0]

     1.upto(@suffix_list.size - 1) do |i|

       s2 = @suffix_list[i]
       max_possible = s2.length / 2   # stop them overlapping

       while  # quick sanity check - saves doing the more expensive  
substring if it fails
              s1[max_length_so_far] == s2[max_length_so_far] &&
              # stop strings from overlapping
              max_length_so_far < max_possible &&
              # brute force comparison
              s1[0,max_plus_one] == s2[0,max_plus_one]

         max_length_so_far = max_plus_one
         max_plus_one += 1
         found_at = i
       end
       s1 = s2
     end

The while loop has three conditions. The last one is the money earner:  
it looks for a match at the start of the two strings which is one  
longer that the longest match found so far. If it finds it, it records  
this as the new longest match, and then tries for a even longer one  
with the current pair.

The second condition on the while loop stops us considering more than  
1/2 the second string. Because our strings are sorted, if thereis  
overlap, the second string will always be the one that contains the  
overlapping segment of the first.

The first condition is just an optimization: it stops us doing the  
more expensive substring comparison if the last characters of the two  
substrings we're comparing don't match.

So, put it all together, and we have

# - - - - - - - - - - - - -

class CommonSeq

   def initialize(text)
     @suffix_list = []
     len = text.length
     len.times { |i| @suffix_list << text[i, len] }  # the ,len] is a  
hack...
     @suffix_list.sort!
   end

   def find_substrings
     max_length_so_far = 0
     max_plus_one      = 1    # save a little math in the loop
     found_at          = nil

     # Look at all adjacent pairs of suffices.
     s1 = @suffix_list[0]

     1.upto(@suffix_list.size - 1) do |i|

       s2 = @suffix_list[i]
       max_possible = s2.length / 2   # stop them overlapping

       while  # quick sanity check - saves doing the more expensive  
substring if it fails
              s1[max_length_so_far] == s2[max_length_so_far] &&
              # stop strings from overlapping
              max_length_so_far < max_possible &&
              # brute force comparison
              s1[0,max_plus_one] == s2[0,max_plus_one]

         max_length_so_far = max_plus_one
         max_plus_one += 1
         found_at = i
       end
       s1 = s2
     end

     if found_at
       suffix = @suffix_list[found_at]
       [max_length_so_far, suffix[0, max_length_so_far]]
     else
       nil
     end
   end
end

if __FILE__ == $0
   seq = CommonSeq.new(STDIN.read.chomp)
   puts seq.find_substrings
end

# - - - - - - - - - - - - -


Run times are shown for the Illiad and War and Peace:

   dave[tmp/seq] time ruby seq.rb <homer.txt
   168
    who are
   perishing and coming to a bad end. We will, however, since you so
   bid us, refrain from actual fighting, but we will make serviceable
   suggestions to the Argives
   ruby seq.rb < homer.txt  2.82s user 0.05s system 99% cpu 2.872 total


   dave[tmp/seq] time ruby seq.rb <war.txt
   63


   The Project Gutenberg Literary Archive Foundation has been
   ruby seq.rb < war.txt  12.49s user 0.17s system 99% cpu 12.737 total


Here's the somewhat basic set of tests I used


# - - - - - - - - - - - - -
require 'seq'
require 'test/unit'

class CommonSeq
   attr_reader :suffix_list
end

class TestSeq < Test::Unit::TestCase

   def test_basic_suffix_creation
     cs = CommonSeq.new("banana")
     assert_equal(%w{a ana anana banana na nana}, cs.suffix_list)
   end

   def test_empty
     cs = CommonSeq.new("")
     assert_nil cs.find_substrings
   end

   def test_length_one
     cs = CommonSeq.new("a")
     assert_nil cs.find_substrings
   end

   def test_length_two_no_match
     cs = CommonSeq.new("ab")
     assert_nil cs.find_substrings
   end

   def test_length_two_with_match
     cs = CommonSeq.new("aa")
     assert_equal [ 1, "a"],  cs.find_substrings
   end

   def test_length_three_no_match
     cs = CommonSeq.new("abc")
     assert_nil cs.find_substrings
   end

   def test_length_three_adjacent_match
     cs = CommonSeq.new("aab")
     assert_equal [ 1, "a"],  cs.find_substrings
   end

   def test_length_three_separated_match
     cs = CommonSeq.new("aba")
     assert_equal [ 1, "a"],  cs.find_substrings
   end

   def test_does_not_find_overlapping_match_length_one
     cs = CommonSeq.new("aaa")
     assert_equal [ 1, "a"],  cs.find_substrings
   end

   def test_does_not_find_overlapping_match_length_three
     cs = CommonSeq.new("aaaa")
     assert_equal [ 2, "aa"],  cs.find_substrings
   end

   def test_does_not_find_overlapping_match_length_two
     cs = CommonSeq.new("ababa")
     assert_equal [ 2, "ab"],  cs.find_substrings
   end

   def test_banana
     cs = CommonSeq.new("banana")
     assert_equal [ 2, "an"],  cs.find_substrings
   end
end
# - - - - - - - - - - - - -


I wouldn't be surprised if the idea of searching only 1/2 of the  
second string to prevent overlaps is wrong.. :)


Dave