--	k4FxW0BwJSi2a18DYir
Content-Type: text/plain
Content-Transfer-Encoding: 7bit

[apologies for the long post here]

Script attached. Text contains spoilers of the worst sort.

On Tue, 2005-01-04 at 01:36 +0900, Glenn Parker wrote:
> My apologies, I was not aware of the suggested time constraint.

No need to apologize. I was free to stop any time. :)

> Nobody spent more than a week's worth of spare time on 
> it, so it seemed to be within bounds for a weekly quiz.  I also think 
> the recent tic-tac-toe problem was of comparable difficulty.

Actually, if I weren't incompetent, it would have taken me a lot less
time. I'm just amused that I had an easier time with the optimized
solver than the brute force solver. Brute force is supposed to be
simple, right? 

> > Now , thanks to crypto3.txt, I will add to the list of ways to foil
> > solvers: write messages that are nearly unsolvable in the first
> > place. :)
> 
> Again, my apologies if that puzzle proved too difficult for some.  I 
> assure that there is at least one solution.  I created these puzzles 
> myself to avoid copyright issues, and it's hard to evaluate their 
> difficulty.  I do know that lots of medium-length words in which no 
> letters are re-used makes it harder, and crypto3.txt is a fine example 
> of that.

I'm just curious about how fast your Ruby solver was able to give a
useful answer on that one. And on what kind of hardware.

> Sometimes, looking at the problem from a more practical standpoint, a 
> complete computer-generated solution is overkill.  If you scan through 
> your algorithm's partial results, you may get enough insight to finish 
> the solution on your own.  This is especially true when your algorithm 
> is stumbling on a proper noun, e.g. an author's name, that is not in 
> your dictionary.

No doubt. Certainly my program did a fairly poor job on all the puzzles,
but in the case of the first two the right answer was obvious to the
human reading it.

Anyway...

My script uses patterns to build test maps for each token in the cipher
text. So if it sees uhixl in the cipher text the pattern is just abcde,
if it sees owwdcd the pattern is abbcdc, etc.

First it reads in the dictionary and stores all of the words in a hash
keyed on word pattern.

Next it reads in the cipher text, figures out the pattern for each
token, gets all the words from the dictionary that match the pattern,
and sorts the list of tokens according to token length (on the
assumption that longer words are more likely to have unique patterns).

Third it sets up an array in which to hold potentially valid maps (all
the maps are held as sets like (ab, cd, ef) which would map the token
ace to bdf) and stocks it with the maps for the first token.

Then it loops over each remaining token, doing a comparison of the
possible maps for that token with the existing set of potentially valid
maps.

For each comparison it figures out how many cipher letters the two sets
of sets have in common. If they have no common letters it performs a
cartesian product of the two sets (which is dangerous when the two sets
are large!), if they have common letters it takes the union of the two
sets where the maps agree on what a mapping is.

Examples: (ab, cd) & (ab, ce)  ) because the sets don't agree on
mapping for c. (ab, cd) & (cd, ef)  ab, cd, ef) because the sets agree
on how to map the characters they have in common.

My script is not very polished, requires the dictionary to be present in
the same directory with the name "dict.txt" and accepts the cryptogram
files as an argument: so `ruby decrypt.rb crypto1.txt` does the magic.

My program output is still somewhat rudimentary, but it gives this for
the first gram, unfortunately it gets excited about the middle name
which it has wrong, but which takes away 'u' for use in its correct
mapping:

denims is one per bent inspiration ninety nine per bent perspiration
t?o?as aqua e?ison
denims is one per cent inspiration ninety nine per cent perspiration
t?o?as aqua e?ison
denims is one per gent inspiration ninety nine per gent perspiration
t?o?as aqua e?ison
denims is one per lent inspiration ninety nine per lent perspiration
t?o?as aqua e?ison
denims is one per vent inspiration ninety nine per vent perspiration
t?o?as aqua e?ison
denims is one per went inspiration ninety nine per went perspiration
t?o?as aqua e?ison

The second gram it got 100%:

the difference between the almost right word and the right word is the difference between the lightning bug and the lightning mark twain

The third gram results in:

migf afwc gf ruffs lsc coisr mantra i elsr is go valise wi?f maf efl
cowls mangle
migf afwh gf ruffs lsh hoisr mantra i elsr is go valise wi?f maf efl
howls mangle
migf afwj gf ruffs lsj joisr mantra i elsr is go valise wi?f maf efl
jowls mangle
migf afwy gf ruffs lsy yoisr mantra i elsr is go valise wi?f maf efl
yowls mangle

Which is perhaps a quote from a Joyce novel. ;)

Glenn, thanks for the challenge. James, thanks for Ruby Quiz. First time
I've tried it but the past weeks have been fun to watch.

 -Michael C. Libby, www.andsoforth.com

--	k4FxW0BwJSi2a18DYir
Content-Disposition: attachment; filename=decrypt.rb
Content-Type: text/plain; name=decrypt.rb; charset=UTF-8
Content-Transfer-Encoding: 7bit

#!/usr/bin/ruby -w

require 'set'

$dict  ile.readlines("dict.txt").map{|x| x.chomp}
$patterns  }


$cipher_text  ile.readlines(ARGV[0]).map{|x| x.chomp}.join(" ")
$cipher_tokens  cipher_text.split
$cipher_token_map_sets  }


def combine_map_sets(a, b)
  puts "combining map sets"
  return b if a.length 0
  return a if b.length 0

  puts "both have some members (a  {a.length}, b  {b.length})"
  matched  ]

  com_let  ommon_letters(a[0], b[0])

  if com_let 0 then
    puts "no common letters - making cartesian product"
    a.each do |as|
      b.each do |bs|
        sas  et.new(as)
        sbs  et.new(bs)
        matched << (sas | sbs)
      end
    end
  else
    puts "common letters: #{com_let}"
    puts "trying all 'a' maps against all 'b' maps"
    a.each do |as|
      b.each do |bs|
        sas  et.new(as)
        sbs  et.new(bs)
        comb_set  as & sbs
        
        if comb_set.length com_let then
          matched << (sas | sbs)
        end
      end
    end
  end
  return matched.map{|m| valid_map?(m)}.compact
end

def common_letters(a, b)
  a_init  et.new(a.map{|x| x[0].chr})
  b_init  et.new(b.map{|x| x[0].chr})
  puts "a has cipher characters: #{a_init.inspect}"
  puts "b has cipher characters: #{b_init.inspect}"
  return (a_init & b_init).length
end

def decrypt(maps)
  puts "decrypting"
  plain_texts  ]
  maps.each do |map|
    plain_texts << $cipher_text.tr(*map_to_tr(map))
  end
  return plain_texts
end

def make_cipher_token_map_sets
  puts "getting maps for each cipher token"
  $cipher_tokens.each do |token|
    next if $cipher_token_map_sets.has_key?(token) #no need to solve the same token twice
    $cipher_token_map_sets[token]  ap_sets(token)
    puts "#{token} has #{$cipher_token_map_sets[token].length} possible maps"
  end
  
end

def make_patterns_from_dict
  puts "making pattern dict"
  $dict.each do |word|
    p  attern(word)
    $patterns[p]  ] unless $patterns.has_key?(p)
    $patterns[p] << word
  end
end

def map_sets(token)
  sets  ]
  token_pattern  attern(token)
  if $patterns.has_key?(token_pattern) then
    $patterns[token_pattern].each do |dict_word|
      this_set  ]
      0.upto(token.length - 1) do |i|
        c  oken[i].chr
        m  ict_word[i].chr
        this_set << "#{c}#{m}"
      end
      sets << Set.new(this_set)
    end
  end
  return sets
end

def map_to_tr(map_set)
  mapping  }
  map_set.each do |map_pair|
    mapping[map_pair[0].chr]  ap_pair[1].chr
  end
  cipher  '
  plain  '
  %w{a b c d e f g h i j k l m n o p q r s t u v w x y z}.each do |x|
    cipher << x
    if mapping.has_key?(x) then
      plain << mapping[x]
    else
      plain << '?'
    end
  end
  return cipher, plain
end

def pattern(word)
  pattern  '
  maps  }
  map_space  w{a b c d e f g h i j k l m n o p q r s t u v w x y z}
  word.each_byte do |b|
    c  .chr
    maps[c]  ap_space.shift unless maps.has_key?(c)
    pattern << maps[c]
  end
  return pattern
end

def valid_map?(map)
  #make map into hash for easy testing
  mapping  }
  map.each do |map_pair|
    c  ap_pair[0].chr #cipher
    p  ap_pair[1].chr #plain
    return nil if c p #cipher cannot map to self
    return nil if mapping.has_key?(c) #cipher cannot map same cipher twice
    mapping[c]  
  end

  cipher_count  apping.keys.length
  plain_count  apping.values.uniq.length
  return nil if plain_count < cipher_count #some plain was not uniq

  return map
end

#####################################################################
# The fun begins...
#
make_patterns_from_dict
make_cipher_token_map_sets

$sorted_cipher_tokens  cipher_tokens.sort{|x, y|
  y.length <x.length
}
puts "sorted list of cipher tokens: #{$sorted_cipher_tokens.join(', ')}"

#seed the matched set, this is the union of all token maps that have matching map pairs 
matched_set  cipher_token_map_sets[$sorted_cipher_tokens[0]]

1.upto($sorted_cipher_tokens.length - 1) do |i|
  new_match  ombine_map_sets(matched_set,
                               $cipher_token_map_sets[$sorted_cipher_tokens[i]]
                               )
  #if nothing matched, just keep the old set? why not.
  matched_set  ew_match unless new_match []
end

puts matched_set.inspect
puts decrypt(matched_set).join("\n")

--	k4FxW0BwJSi2a18DYir--