--------------030707030609070201020205
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

> by Martin DeMello
> 
> In Scrabble parlance, a 'bingo stem' is defined as a set of six
> letters that combines with a large fraction of the alphabet to anagram
> to valid seven letter words (a 'bingo' is a move using all seven tiles
> on your rack). For instance, one of the more prolific stems, SATIRE,
> combines with twenty letters, A, B, C, D, E, F, G, H, I, K, L, M, N,
> O, P, R, S, T, V, W (forming words like ASTERIA, BAITERS, RACIEST
> ...).
> 
> Write a program that, given a word list and a cutoff n, finds all 6
> letter stems that combine with n or more letters, sorted in order of
> how many distinct letters they combine with.

Here's my solution.  My earlier performance problems resulted from 
several very inefficient attempts at generating unique letter 
combinations, i.e. list all unique sets of "N letters taken 6 at a 
time".  Reducing the problem to "7 letters taken 6 at a time" made that 
part trivial, but I kept plugging at the more general problem.

The dictionary I used was 2of4brif.txt, available as part of the 12Dicts 
package at http://prdownloads.sourceforge.net/wordlist/12dicts-4.0.zip

On my 2GHz P4, I get 5319 stems with a cutoff of 3 in 4.062 seconds.  A 
version optimized for 7-letter words (not shown) takes barely over 2 
seconds.

If I swap line 47 in for line 48, so that all words of 7 letters or more 
are processed, I get 167,227 stems with a cutoff of 3 in 2196.453 
seconds (~37 min.).

-- 
Glenn Parker | glenn.parker-AT-comcast.net | <http://www.tetrafoil.com/>

--------------030707030609070201020205
Content-Type: text/plain;
 nameingo1.rb"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filenameingo1.rb"

#!/usr/bin/env ruby

class Combiner
  include Enumerable
  
  def initialize(pick, from)
    # How many objects should be returned in each set
    @pick  ick
    # List of objects from which to produce combinations
    @from  rom
  end
  
  def each
    unpicked  from.length - @pick
    # Indices of objects in @from to return as a set.
    selectors  0...@pick).to_a
    loop do
      yield selectors.collect {|i| @from[i]}
      # Advance the selectors array to the next combination.
      s  pick - 1
      s -  while selectors[s] s + unpicked
      break if s < 0
      value  electors[s]
      s.upto(@pick - 1) do |i|
        selectors[i]  value + )
      end
    end
  end
end

class BingoStems
  STEMLENGTH  
  
  def initialize(filename, cutoff)
    @filename  ilename
    @cutoff  utoff
    @stems  }
  end
  
  def find
    start_time  ime.now
    # Collect stems for each word in the word list
    File.open(@filename) do |file|
      file.each do |word|
        word.chomp!
        # Use the next line alternative to process longer words.
        #next if word.length < TEMLENGTH
        next if word.length ! STEMLENGTH + 1)
        word.downcase!
        Combiner.new(STEMLENGTH, word.split(//).sort).each do |stem|
          (@stems[stem.join('')] || })[word]  rue
        end
      end
    end
    # Discard stems with less than @cutoff word combos
    @stems.delete_if {|k, v| v.length < @cutoff}
    puts "found #{@stems.length} stems in #{Time.now - start_time} seconds"
    self
  end
  
  def report(stream  stdout, verbose  alse)
    keys  stems.keys.sort_by {|a| @stems[a].length}
    keys.each do |key|
      stream.puts "#{key}: #{@stems[key].length}"
      next unless verbose
      @stems[key].each_key do |word|
        stream.puts "   " + word
      end
    end
    stream.flush
  end
  
end

CUTOFF  RGV[0].to_i
DICT  RGV[1]
BingoStems.new(DICT, CUTOFF).find.report(File.open('output.txt', 'w'), true)

--------------030707030609070201020205--