Ruby Quiz wrote: > This quiz is to make a Hangman guessing player in Ruby. Play should proceed as > follows: I focused on building a program that makes good guesses. == Algorithm overview The guesser reads a dictionary and then builds a database (which is reused) with the following information about each word: * size * positions of each character * number of occurrences of each character The basic algorithm is as follows: * Remove all words that do not have a matching length. * While the game has not been solved: ** Pick the character included in the most words still remaining. ** If the character is not in the word: remove all words with the character. ** If the character is in the word: remove all words that do not contain the character at exactly the revealed positions. === Weaknesses The algorithm is not optimal. The character that's included in the most words is not necessarily the character which will give the largest reduction in number of potential words since position is not considered. Consider a dictionary with the following: bdc ebc fcb b and c are tied for number of occurrences, but b would be the better choice. If we pick b we will in all cases be left with one potential word. If we pick c and the word is one of the first two we get two potential words. My guess is that it's a good enough heuristic in most cases though. Also note that this program is built on the assumption that the word is picked randomly from the dictionary. More refined solutions could weigh in the relative frequency of different words in normal English text. == Speed It takes about 40 minutes to create the database for a dictionary with 4*10^5 words, but it only has to be created once. Computing all guesses for a word (i.e. from being given the length to having the correct word) takes about 30 to 40 seconds for a dictionary with 4*10^5 words. That time includes about 10 seconds to reset the database from previous uses, another 10 seconds for pruning based on word length and the rest for the remaining search. === Possible improvements Much of the initial sorting could be precomputed (e.g. split words into different table based on length and then only work against the table with the specified length) to cut down on the time needed reset and do the initial pruning. The first (and possibly some additional steps) could also be precomputed. == Dependencies Requires a mysql database and the mysql-gem. You need to enter your username, passwords and database name in HangmanGuesser#db_connection below. == The code #!/usr/bin/env ruby # == Synopsis # # automated_hangman: plays a game of hangman with the word of your # choice # # == Usage # # automated_hangman [OPTION] ... WORD # # -h, --help: # show help # # -d, --dictionary [dictionary location]: # sets up the database to use the specified dictionary (defaults to # /usr/share/dict/words), can take some time # # WORD: The word that the program should try to guess. require 'getoptlong' require 'rdoc/usage' require 'mysql' # Describes a game of hangman. class Hangman LIVES = 6 # Creates a new game of hangman where word is the target word. def initialize(word) @guesses = [] @word_characters = word.chomp.downcase.split(//) end # Returns an array containing the incorrect guessed characters. def incorrect_guesses @guesses - @word_characters end # Guesses a specified character. Returns an array of indices (possibly # empty) where the character was found. def guess(char_guess) @guesses << char_guess indices = [] @word_characters.each_with_index do |character, index| indices << index if character == char_guess end return indices end # Returns a string representation of the current progress. def to_s hidden_characters = @word_characters - @guesses return @word_characters.join(' ') if hidden_characters.empty? @word_characters.join(' ').gsub( /[#{hidden_characters.uniq.join}]/, '_') end # Checks whether the player has won. def won? (@word_characters - @guesses).empty? end # Checks whether the player has lost. def lost? incorrect_guesses.size > LIVES end # Gets the number of characters in the word. def character_count @word_characters.size end end # The guessing machine which picks the guesses. class HangmanGuesser # The location of the default dictionary to use. DICTIONARY_FILE = '/usr/share/dict/words' # An array of the characters that should be considered. CHARACTERS = ('a'..'z').to_a # Set this to true to see how the search progresses. VERBOSE = true # The maximum word length accepted. MAX_WORD_LENGTH = 50 # The dictionary given should be the location of a file containing one # word per line. The characters should be an array of all characters # that should be considered (i.e. no words with other characters are # included). def initialize(hangman_game, characters = CHARACTERS) @con = self.class.db_connection @characters = characters @hangman_game = hangman_game reset_tables prune_by_word_length @hangman_game.character_count end # Returns the guesses that the guesser would make. def guesses @guesses = [] log{ "There are #{word_count} potential words left." } while not @hangman_game.won? guess = next_guess raise 'The word is not in the dictionary.' if guess.nil? @guesses << guess log{ "Guessing #{guess}" } add_information(guess, @hangman_game.guess(guess)) log_state log{ "\n" } end return @guesses end class << self # Creates the database and populates it with the dictionary file # located at the specified location. Only considers the specified # characters (array). def create_database(dictionary = DICTIONARY_FILE, characters = CHARACTERS) @con = db_connection @characters = characters @tables = ['words'] + @characters + @characters.map{ |c| c + '_occurrences'} create_tables populate_tables File.open(dictionary) end # Connects to the database that should store the tables. def db_connection # Replace <username> and <password> with the database username and # password. Mysql.real_connect("localhost", <username>, <password>, "hangman") end private # Creates the tables used to store words. def create_tables # Drop old tables. @tables.each do |table| @con.query "DROP TABLE IF EXISTS `#{table}`" end # Words table. @con.query <<-"end_sql" CREATE TABLE `words` ( `word_id` mediumint(8) unsigned NOT NULL AUTO_INCREMENT, `word` varchar(#{MAX_WORD_LENGTH}) NOT NULL, `length` tinyint(3) unsigned NOT NULL, `removed` tinyint(1) unsigned NOT NULL DEFAULT '0', PRIMARY KEY (`word_id`), INDEX (`removed`), INDEX (`length`) ) ENGINE=MyISAM end_sql # Tables for the number of occurrences of each character. character_occurrences_table_template =<<-'end_template' CREATE TABLE `%s_occurrences` ( `word_id` mediumint(8) unsigned NOT NULL, `occurrences` tinyint(3) unsigned NOT NULL, PRIMARY KEY (`occurrences`, `word_id`), INDEX (`word_id`) ) ENGINE=MyISAM end_template # Tables for the positions of each character. character_table_template =<<-'end_template' CREATE TABLE `%s` ( `word_id` mediumint(8) unsigned NOT NULL, `position` tinyint(3) unsigned NOT NULL, PRIMARY KEY (`position`, `word_id`), INDEX (`word_id`) ) ENGINE=MyISAM end_template @characters.each do |character| @con.query character_occurrences_table_template % character @con.query character_table_template % character end end # Loads a dictionary into the database. def populate_tables(dictionary_file) # Disable the keys so that we don't update the indices while # adding. @tables.each do |table| @con.query("ALTER TABLE #{table} DISABLE KEYS") end # Prepare statements. add_word = @con.prepare( "INSERT INTO words (word, length) VALUES (?, ?)") add_character = {} add_character_occurrences = {} @characters.each do |character| add_character[character] = @con.prepare( "INSERT INTO #{character} (word_id, position) VALUES (?, ?)") add_character_occurrences[character] = @con.prepare( "INSERT INTO #{character}_occurrences " + "(word_id, occurrences) VALUES (?, ?)") end # Populate the database. previous_word = nil dictionary_file.each_line do |line| # Only consider words that only contain characters a-z. Make # sure we don't get duplicates. word = line.chomp.downcase next if word == previous_word or word =~ /[^a-z]/ or word.size > MAX_WORD_LENGTH # Add the word, its character positions and number of # occurrences. add_word.execute(word, word.size) word_id = @con.insert_id characters = word.split(//) characters.each_with_index do |character, position| add_character[character].execute(word_id, position) end @characters.each do |character| occurrences = characters.select{ |c| c == character }.size add_character_occurrences[character].execute( word_id, occurrences) end previous_word = word end # Generate the indices. @tables.each do |table| @con.query("ALTER TABLE #{table} ENABLE KEYS") end end end private # Logs the current state of the guessing process. def log_state log do messages = [] messages << @hangman_game.to_s count = word_count messages << "There are #{count} potential words left." if count <= 10 res = @con.query('SELECT word FROM words WHERE removed = 0') res.each{ |row| messages << row[0] } res.free end messages.join("\n") end end # Logs the string produced by the block (may not be executed at all). def log(&block) puts yield() if VERBOSE end # Gets the number of potential words left. def word_count res = @con.query('SELECT COUNT(*) FROM words WHERE removed = 0') count = res.fetch_row[0].to_i res.free return count end # Computes the next character that should be guessed. The next guess # is the character (that has not yet been tried) that occurrs in the # most words remaining. def next_guess next_character = nil max_count = 0 (@characters - @guesses).each do |character| res = @con.query( "SELECT COUNT(DISTINCT word_id) FROM #{character} " + "NATURAL JOIN words WHERE removed = 0") count = res.fetch_row[0].to_i res.free if count > max_count next_character = character max_count = count end end return next_character end # Adds the information about at what indices in the word the specified # character can be found to the guesser. def add_information(character, indices) if indices.empty? # The character isn't in the word. sql =<<-"end_sql" UPDATE words SET removed = 1 WHERE removed = 0 AND word_id IN ( SELECT word_id FROM #{character} ) end_sql else # Remove all words where the character isn't at the specified # places. sql =<<-"end_sql" UPDATE words NATURAL JOIN #{character}_occurrences SET removed = 1 WHERE removed = 0 AND (occurrences != #{indices.size} OR word_id IN ( SELECT word_id FROM #{character} WHERE position NOT IN (#{indices.join(', ')}) ) ) end_sql end @con.query(sql) end # Resets the table to start a new round of guesses. def reset_tables @con.query('UPDATE words SET removed = 0') end # Prunes all words that do not have the specified length. def prune_by_word_length(expected_length) @con.query( "UPDATE words SET removed = 1 WHERE length != #{expected_length}") end end opts = GetoptLong.new( [ '--help', '-h', GetoptLong::NO_ARGUMENT], ['--dictionary', '-d', GetoptLong::OPTIONAL_ARGUMENT]) opts.each do |opt, arg| case opt when '--help' RDoc::usage when '--dictionary' if arg != '' HangmanGuesser.create_database(arg) else HangmanGuesser.create_database end end end if ARGV.size != 1 abort "Incorrect usage, see --help" end game = Hangman.new(ARGV[0]) guesses = HangmanGuesser.new(game).guesses if game.won? puts 'Successfully guessed the word.' else game.lost? puts 'Failed guessing the word.' end puts "Made the following guesses: #{guesses.join(', ')}" puts "Expended a total of #{game.incorrect_guesses.size} lives." -- Andreas Launila