Greetings Ruby fans,

I'm a greenhorn at this cool language, Ruby, with much to learn.  Perhaps
you chaps could help me with an issue I have.  I've read through a number
of the posts on sorting Arrays and Hashes, and yet I can't seem to put my
finger on the solution.  I want to sort my results on the second column.
From the information I gathered, it seemed that I need to collect my
results into a Hash.  Am I on the right track?

Oh, let me tell you what you're looking at here: I am scanning each mail
file in our qmail queue for commonalities (spammers), instead of using the
useless (in my opinion) qmHandle that we have for qmail.  So, I've got a
working prototype.  If you could help me with my sort, and if you have any
other comments or suggestions to throw my way, I'm sure I could learn a
thing or two.  Being new to Ruby, there are a lot of new ideas here.
Thanks, guys.

Code:
#!/usr/local/bin/ruby -w
require 'find'

@results = Array.new

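# Walk the queue directory tree and parse every regular file in it
def scan_dirs
  root = "/var/qmail/queue/mess"
  Find.find(root) do |path|
    # Find.find also yields the directories themselves; skip those
    parse_file(path) if File.file?(path)
  end
  @results.sort!  # sorts on the first column (Mess#); I want the second
  print_results
end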

# Parse each file for the information we want
def parse_file(path)
  file = File.basename(path) # queue file name, i.e. the message number
  sourceip = ""
  email = ""
  subject = ""
  line_no = 0

  # File.foreach closes the file for us, even when we break out early
  File.foreach(path) do |line|
    line = line.strip # Remove leading/trailing whitespace, including \r\n
    line_no += 1

    # Internal bounce message
    sourceip = "SMTP" if line_no == 1 and line.include?("invoked for bounce")

    if line_no == 2 and sourceip.empty?
      if line.include?("webmail.commspeed.net")
        sourceip = "Webmail"
      else
        # String#[] with a regexp returns the first match, or nil
        sourceip = line[/\b\d{1,3}(?:\.\d{1,3}){3}\b/] || "No Source IP**"
      end
    end

    if (line.include?("SquirrelMail") and sourceip == "Webmail") or
       (line.include?("From:") and sourceip != "Webmail")
      email = get_email(line) if email.empty?
    end

    if line.include?("Subject:") and subject.empty?
      subject = truncate(line, 50)
    end

    break if line_no == 20 # Nothing more we want to read in the file
  end

  # Record the row even when the file has fewer than 20 lines.
  # to_s keeps the email column a String whatever get_email hands back.
  @results << [file, sourceip, email.to_s, subject]
end

# Truncate subject line
def truncate(string, width)
  if string.length <= width
    string
  else
    string[0, width-3] + "..."
  end
end

# Print out results
def print_results
  print "\e[2J\e[f" # ANSI escapes: clear the screen, move cursor home

  print "Mess#".ljust(10)
  print "Source".ljust(18)
  print "Email Address".ljust(30)
  puts  "Subject".ljust(50)
  puts "-" * 111

  @results.each do |row|
    print row[0].ljust(10)
    print row[1].ljust(18)
    print row[2].ljust(30)
    puts  row[3].ljust(50)
  end
end

# Get email address from line/string
def get_email(line_to_parse)
  # Return the first email address on the line as a String ("" if none)
  line_to_parse[/\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b/i].to_s
end

# Ok, begin our scan
scan_dirs
exit
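
To make the sort question concrete, here is a minimal sketch of ordering
rows on the second column.  It assumes sort_by is a reasonable tool for
this and that a plain string comparison of the Source column is good
enough; the rows are made-up sample data in the same four-column shape as
@results, not real queue contents.

```ruby
# Made-up sample rows: [mess#, source, email, subject]
rows = [
  ["3360108", "111.111.17.1",  "hobby@example.net",  ""],
  ["3360186", "Webmail",       "fisher@example.net", "Subject: hello"],
  ["3360167", "111.111.7.213", "hunter@example.net", "Subject: hi"]
]

# sort_by builds a new array ordered by the block's value; here that is
# the second column (index 1), compared as plain strings
by_source = rows.sort_by { |row| row[1] }

by_source.each { |row| puts row.join("  ") }
```

Because the comparison is on strings, "111.111.17.1" sorts before
"111.111.7.213"; a numeric ordering of IP addresses would need a different
sort key.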

Partial results listing: (I've modified the content to protect privacy)
Mess#     Source            Email Address                 Subject
---------------------------------------------------------------------------------------------------------------
3360108   111.111.17.1      hobby / emailhost.net
3360167   111.111.7.213     hunter / emailhost.ner          Subject: Removed to protect the innocent....
3360186   Webmail           fisher / emailhost.net          Subject: Removed to protect the innocent
3360209   111.111.40.10     curator / aneatmuseum.org
3360215   111.111.15.110    blueprints / emailhost.net      Subject: Removed to protect the innocent
3360217   111.111.9.248     user1 / emailhost.net           Subject: Removed to protect the innocent
3360226   111.111.11.43     user / emailhost.net            Subject: Removed to protect the innocent
3360228   111.111.16.34     user / emailhost.net            Subject: Pictures
3360241   111.111.18.73     joe / agooduser.com            Subject: Removed to protect the innocent
3360242   111.111.14.109    user / emailhost.net            Subject: Emailing: maps.htm
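
Since the point of the scan is to spot commonalities, one possible use of
a Hash here is counting messages per source rather than sorting.  This is
only a rough sketch under that assumption: the rows are made-up sample
data, and the names counts and ranked are hypothetical, not part of the
prototype above.

```ruby
# Made-up sample rows: [mess#, source, email, subject]
rows = [
  ["3360108", "111.111.17.1", "hobby@example.net",      ""],
  ["3360186", "Webmail",      "fisher@example.net",     ""],
  ["3360215", "111.111.17.1", "blueprints@example.net", ""]
]

# Hash.new(0) gives every unseen key a starting count of 0
counts = Hash.new(0)
rows.each { |row| counts[row[1]] += 1 }

# Busiest sources first: sort the [source, count] pairs by descending count
ranked = counts.sort_by { |_source, count| -count }
ranked.each { |source, count| puts "#{source.ljust(18)}#{count}" }
```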
-- 
Posted via http://www.ruby-forum.com/.