I used a bit of freedom:

* It reads from STDIN instead of ARGV.

* Output is binary instead of 0-and-1-as-text.

* Both encoding and decoding, for the extra points...

* In order to avoid juggling with padding characters, the least
  frequent character is 1110 instead of 111 (in the example in
  the quiz). Every "character" in the encoded bit stream ends
  with a 0. Yes, it costs one bit, but since it's not used
  frequently, it doesn't really hurt.

* 0 and 1 become 1 and 0, respectively, because we get the
  padding zeroes for free when we do "001".pack("B*"). So,
  rewriting the previous section: In order to avoid juggling
  with padding characters, the least frequent character is 0001
  instead of 000. Every "character" in the encoded bit stream
  ends with a 1.

Here's the code: http://tinyurl.com/3x5nsy

Example:

 $ echo -n  Scooby-Dooby-Doo | ruby huffman.rb | ruby huffman.rb -d
 Compression: 14/16 (87%)
 Decompression: 14/16 (87%)
 Scooby-Dooby-Doo

How does it work?

First of all, we count each unique byte in the original byte
stream (e.g. "Scooby-Dooby-Doo"):

 {"-"=>2, "b"=>2, "y"=>2, "c"=>1, "D"=>2, "o"=>6, "S"=>1}

... which is sorted:

 [["o", 6], ["-", 2], ["D", 2], ["b", 2], ["y", 2], ["S", 1], ["c",
1]]

... and cleaned:

 ["o", "-", "D", "b", "y", "S", "c"]

The leftmost character is used most often and the rightmost
character is used least often. We call this ordered list the
"alphabet".

Secondly, we build the translation table, where each character
in the "alphabet" is mapped to a bit sequence:

 {"b"=>"0001", "-"=>"01", "c"=>"0000001", "y"=>"00001", "D"=>"001",
"o"=>"1", "S"=>"000001"}

We can translated each byte of the original byte stream:

 ["S", "c", "o", "o", "b", "y", "-", "D", "o", "o", "b", "y", "-",
"D", "o", "o"]

... to the corresponding bit stream:

 ["000001", "0000001", "1", "1", "0001", "00001", "01", "001", "1",
"1", "0001", "00001", "01", "001", "1", "1"]

... which is joined into a string:

 "00000100000011100010000101001110001000010100111"

... which is packed (with pack("B*")) into another string:

 "\004\016!N!N"

... which we'll call the "encoded bit stream".

In order to be able to decode the message, we have to serialize
the "alphabet" and concatenate it with the "encoded bit stream":

 alphabet  + bit stream     -> message
 "o-DbySc" + "\004\016!N!N" -> "o-DbySc\004\016!N!N"

But where's the boundary between "alphabet" and "encoded bit
stream"? Well, let's add the length of the "alphabet"
([7]..pack("C")):

 len  + alphabet  + bit stream     -> message
 "\a" + "o-DbySc" + "\004\016!N!N" -> "\ao-DbySc\004\016!N!N"

Conclusion? "Scooby-Dooby-Doo" becomes "\ao-DbySc\004\016!N!N".

Well, we've saved 2 bytes, but it doesn't sound that well...

gegroet,
Erik V. - http://www.erikveen.dds.nl/