Hi all,
I'm writing a web crawler with thread support, and memory usage keeps
growing the more links it processes. The program starts at about 20 MB;
after crawling 150-200 links it's at ~100 MB, and after 1000 links it
can use up to 1 GB. Can anyone help me figure out why?

require 'rubygems'
require 'mechanize'
require 'hpricot'
require 'yaml'
require 'net/http'
require 'uri'
require 'modules/common'

Thread.abort_on_exception = true
$config = YAML.load_file "config.yml"
links = IO.read("bases/3+4.txt").split("\n")
threads = []

links.each do |link|
  if Thread.list.size < 50
    threads << Thread.new(link) { |my_link|
      Common.post_it(my_link)
    }
  else
    sleep(1)
    # Join finished threads AND remove them from the array; otherwise
    # `threads` grows without bound and keeps every dead thread (plus
    # whatever its block returned) reachable, so it can never be GC'd.
    finished = threads.reject(&:status)
    finished.each(&:join)
    threads -= finished
    puts "total threads: #{Thread.list.size}"
    redo
  end
end

threads.each { |t| t.join }
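For comparison, the sleep/redo bookkeeping can be replaced with a fixed
pool of workers pulling links from a Queue, so at most 50 threads ever
exist and no per-link Thread objects accumulate. This is just a sketch:
the link list and the worker body (a stand-in for Common.post_it) are
placeholders.

```ruby
require 'thread'

links = (1..200).map { |i| "http://example.com/page/#{i}" }  # placeholder links

work = Queue.new
links.each { |l| work << l }

done = Queue.new
workers = 50.times.map do
  Thread.new do
    loop do
      link = begin
        work.pop(true)      # non-blocking pop; raises ThreadError when empty
      rescue ThreadError
        break               # queue drained, worker exits
      end
      done << link.size     # stand-in for Common.post_it(link)
    end
  end
end
workers.each { |t| t.join }
puts "processed #{done.size} links"
```

With this shape the pool size is the only knob, and finished work never
piles up in a growing array of dead threads.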


What the "Common" module does:
1. Crawler (net/http or Mechanize - I tried both, same results)
2. HTML parser (Hpricot or Nokogiri - I tried both as well, with the
same bad results)
It extracts some data from the page and saves it to a file. Nothing
special, as you can see.
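One thing worth checking on the Mechanize path: Mechanize keeps fetched
pages in a history list, and in some versions that history is unbounded
by default, so a long crawl retains every parsed page. If that is the
cause here, capping it is a one-line agent setting; this is a hedged
config sketch (`max_history` is the documented Mechanize accessor, but
verify it against the version you have installed):

```ruby
require 'rubygems'
require 'mechanize'

agent = Mechanize.new
agent.max_history = 1   # keep at most one page in history instead of all of them
```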

When I run the program without threads, I get the same results. :(

Please help - is this my fault, or is something wrong in the libraries?
-- 
Posted via http://www.ruby-forum.com/.