Does this library have any practical value?  Probably not.  It's been suggested
in the Perl community that hacks like this are a good minor deterrent to those
trying to read source code you would rather keep hidden, but it must be stressed
that this is no form of serious security.  Regardless, it's a fun little toy to
play with.

It was mentioned in the discussion that Perl, where ACME::Bleach comes from,
includes a framework for source filtering.  It can be used to make modules that
modify source code much as we are doing in this quiz.  Perl's Switch.pm is a
good example of this, but ironically ACME::Bleach is not.

That naturally leads to the question: can you build source filters in Ruby?
Clearly we can build ACME::Bleach, but not all source filters are as simple,
I'm afraid.  Consider this:

	#!/usr/local/bin/ruby -w

	require "fix_my_broken_syntax"

	invalid++

Now the thought here is that fix_my_broken_syntax.rb will read my source, change
it so that it does something valid, eval() it, and exit() before the invalid
code is an issue.  Here's a trivial example of fix_my_broken_syntax.rb:

	#!/usr/local/bin/ruby -w

	puts "Fixed!"
	exit

Does that work?  Unfortunately, no:

	$ ruby invalid.rb 
	invalid.rb:5: syntax error
	invalid++
	         ^

Ruby never gets to loading the library, because it's not happy with the syntax
of the first file.  That makes writing a source filter for anything that isn't
valid Ruby syntax complicated.  And if it is valid Ruby syntax, you can
probably just code it up in Ruby to begin with.

Except for whiteout.rb, our version of ACME::Bleach.

You can't build Ruby constructs out of whitespace alone, so some form of source
filtering is required.  Luckily, we can get away with the approach described
above for this source filter, because a bunch of whitespace (with no code) is
valid Ruby syntax.  It just doesn't do anything.  Ruby will skip right over our
whitespace and load the library that restores and runs the code.
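
To make the difference concrete, here is a minimal sketch that asks the
interpreter whether a piece of source parses, without running it.  The helper
name valid_syntax? is made up for this illustration, and
RubyVM::InstructionSequence is specific to the standard C implementation of
Ruby, so treat this as an assumption-laden demonstration rather than a portable
tool.

```ruby
# valid_syntax? is a hypothetical helper, not part of any quiz solution.
# RubyVM::InstructionSequence only exists in the standard (CRuby) interpreter.
def valid_syntax?(source)
  # compile() parses the source but does not execute it.
  RubyVM::InstructionSequence.compile(source)
  true
rescue SyntaxError
  false
end

# The broken script from above: the parser rejects it before require runs.
broken = "require \"fix_my_broken_syntax\"\n\ninvalid++\n"

# A require line followed by nothing but spaces, tabs and newlines.
whitened = "require \"whiteout\"\n \t\t  \t \t\n\t \t  \t\t \n"

puts valid_syntax?(broken)
puts valid_syntax?(whitened)
```

On CRuby this should report false for the broken script and true for the
whitened one, which is exactly why the require line gets its chance to run.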

Most people took this approach.  Let's examine one such example by Robin
Stocker:

	#!/usr/bin/ruby

	#
	# This is my solution for Ruby Quiz #34, Whiteout.
	# Author::  Robin Stocker
	#

	#
	# The Whiteout module includes all functionality like:
	# - whiten
	# - run
	# - encode
	# - decode
	#
	module Whiteout

	  @@bit_to_code = { '0' => " ", '1' => "\t" }
	  @@code_to_bit = @@bit_to_code.invert
	  @@chars_to_ignore = [ "\n", "\r" ]

	  #
	  # Whitens the content of a file specified by _filename_.
	  # It leaves the shebang intact, if there is one.
	  # At the beginning of the file it inserts the require 'whiteout'.
	  # See #encode for details about how the whitening works.
	  #
	  def Whiteout.whiten( filename )
	    code = ''
	    File.open( filename, 'r' ) do |file|
	      file.each_line do |line|
	        if code.empty?
	          # Add shebang if there is one.
	          code << line if line =~ /#!\s*.+/
	          code << "#{$/}require 'whiteout'#{$/}"
	        else
	          code << encode( line )
	        end
	      end
	    end
	    File.open( filename, 'w' ) do |file|
	      file.write( code )
	    end
	  end
	
	  # ...

First, we can see that the module defines some module variables, which are
really used as constants here.  Their contents hint at the encoding algorithm
we'll see later.

Then we have a method for managing the transformation of the source into
whitespace.  It starts by opening the passed file and reading the code
line-by-line.  If the first line is a shebang line, it's saved in the variable
code.  Next, a "require 'whiteout'" line is added to code.  Finally, every later
line from the file is appended to code after being passed through an encode()
method we'll examine shortly.  (Note that the first line is never encoded.  If
it isn't a shebang line, it is simply dropped, so this works best on files that
start with one.)  With the contents read and transformed, the method then
reopens the source for writing and dumps the modifications into it.

The next method is the reverse process:

	  # ...
	
	  #
	  # Reads the file _filename_, decodes and runs it through eval.
	  #
	  def Whiteout.run( filename )
	    text = ''
	    File.open( filename, 'r' ) do |file|
	      decode = false
	      file.each_line do |line|
	        if not decode
	          # We don't want to decode the "require 'whiteout'",
	          # so start decoding not before we passed it.
	          decode = true if line =~ /require 'whiteout'/
	        else
	          text << decode( line )
	        end
	      end
	    end
	    # Run the code!
	    eval text
	  end
	
	  # ...

This method again reads the passed file.  It skips over the "require 'whiteout'"
line, then copies the rest of the file into the variable text, after passing it
through decode() line-by-line.  The final line of the method calls eval() on
text, which should now contain the restored program.

On to encode() and decode():

	  #
	  # Encodes text to "whitecode". It works like this:
	  # - Chars in @@chars_to_ignore are ignored
	  # - Each byte is converted to its bit representation,
	  #   so that we have something like 01100001
	  # - Then, it is converted to whitespace according to @@bit_to_code
	  # - 0 results in a " " (space)
	  # - 1 results in a "\t" (tab)
	  #
	  def Whiteout.encode( text )
	    white = ''
	    text.scan(/./m) do |char|
	      if @@chars_to_ignore.include?( char )
	        white << char
	      else
	        char.unpack('B8').first.scan(/./) do |bit|
	          code = @@bit_to_code[bit]
	          white << code
	        end
	      end
	    end
	    return white
	  end

	  #
	  # Does the inverse of #encode, it takes "white"
	  # and returns the decoded text.
	  #
	  def Whiteout.decode( white )
	    text = ''
	    char = ''
	    white.scan(/./m) do |code|
	      if @@chars_to_ignore.include?( code )
	        text << code
	      else
	        char << @@code_to_bit[code]
	        if char.length == 8
	          text << [char].pack("B8")
	          char = ''
	        end
	      end
	    end
	    return text
	  end

	end
	
	# ...

The comments in there detail the exact process we're looking at here, so I'm not
going to repeat them.

Note that @@chars_to_ignore contains "\n" and "\r", so they are not translated.
The effect is that line endings are untouched by this conversion.  Some
solutions used such characters in their encoding algorithm.  The gotcha there is
that any line-ending translation done to the modified source (say, FTP in ASCII
mode) will break the hidden code.  Robin's solution doesn't have that problem.
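
For readers who want to play with the idea, here is a compact sketch of the
same bit-to-whitespace mapping.  It is not Robin's code:  the names
white_encode and white_decode are invented for this example, it leans on
unpack1 (Ruby 2.4 or later), and it only protects the line ending at the end
of each line rather than every "\r" and "\n" it meets.

```ruby
# white_encode and white_decode are hypothetical names for this sketch;
# they are not the methods from Robin's submission.

# Turn each line's content into spaces (0 bits) and tabs (1 bits),
# leaving the line ending itself alone.
def white_encode(text)
  text.each_line.map do |line|
    body   = line.chomp               # content without the line ending
    ending = line[body.length..-1]    # whatever chomp stripped off
    body.unpack1("B*").tr("01", " \t") + ending
  end.join
end

# Reverse the mapping: spaces and tabs back to bits, bits back to bytes.
def white_decode(white)
  out = String.new                    # empty binary buffer
  white.each_line do |line|
    body   = line.chomp
    ending = line[body.length..-1]
    out << [body.tr(" \t", "01")].pack("B*") << ending
  end
  out.force_encoding(white.encoding)
end

source = "puts 'Hello'\r\nputs 'World'\n"
white  = white_encode(source)

puts white.length                     # eight characters per encoded byte
puts white_decode(white) == source    # the round trip should restore the code
```

Because the "\r\n" and "\n" endings pass through unchanged, a tool that
rewrites line endings would alter only those characters and not the encoded
bits, which is the property praised above.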

Here's the code that ties all those methods into a solution:

	# ...
	
	#
	# And here's the logic part of whiteout.
	# If it was run directly, whites out the files in ARGV.
	# And if it was required, decodes the whitecode and runs it.
	#
	if __FILE__ == $0
	  ARGV.each do |filename|
	    Whiteout.whiten( filename )
	  end
	else
	  Whiteout.run( $0 )
	end

Again, the comment saves me some explaining.

That was Robin's first solution to a Ruby Quiz, but I never would have known
that from looking at the code.  Thanks for sharing, Robin!

Obviously, a conversion of this type grossly inflates the size of the source.
Each byte becomes eight whitespace characters, so the encoded code is around
eight times its original size.  A couple of solutions used zlib to control the
expansion, which I thought was clever.  By compressing the source before
encoding it (and using a base three conversion), Dominik Bathon cut the
inflation to around three times the original size.
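
The arithmetic behind that is easy to sketch.  Since 3**5 is 243 and 3**6 is
729, a byte fits in six base three digits instead of eight binary ones, and
compressing first shrinks the number of bytes to encode.  The code below is
only a guess at the general shape of such a solution, not Dominik's actual
code:  the helper names and the choice of space, tab and newline as the three
"digits" are assumptions made for this example.

```ruby
require "zlib"

# All names here are hypothetical; the three-character alphabet
# (space, tab, newline) is an assumption of this sketch.

# Write each byte as six base three digits, then map digits to whitespace.
def trit_encode(bytes)
  bytes.each_byte.map { |b| b.to_s(3).rjust(6, "0").tr("012", " \t\n") }.join
end

# Read the whitespace back six characters at a time.
def trit_decode(white)
  white.scan(/.{6}/m).map { |t| t.tr(" \t\n", "012").to_i(3) }.pack("C*")
end

# Compress first, so there are fewer bytes to inflate.
def squeeze(source)
  trit_encode(Zlib::Deflate.deflate(source))
end

def unsqueeze(white)
  Zlib::Inflate.inflate(trit_decode(white))
end

source = "puts 'Hello, world!'\n" * 50
white  = squeeze(source)

puts source.length
puts white.length
puts unsqueeze(white) == source
```

How much this saves depends entirely on how well the source compresses, and
using newlines as a digit brings back the line-ending fragility mentioned
earlier, so a real solution would have to weigh that trade-off.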

Ara.T.Howard took a different approach, using whiteout.rb as a database to store
the trimmed files.  That was a very interesting process, demonstrated well in
the submission email.  The advantages to this approach would be no inflation
penalty and the code stays readable (just not in the original location).  The
disadvantage I see is that it requires the exact same library to be present both
at encoding and decoding, which probably makes sharing the altered code
impractical.

As always, my thanks to all who gave this little diversion an attempt.  I'm sure
we'll see tons of whitespace-only code on RubyForge in the future, thanks to our
efforts.

Tomorrow begins part one of our first two-part Ruby Quiz.  Stay tuned...