Jason,

Is line 6334, which shows up in the traceback, this line?

>   consumers.each{|th| th.join}

One tip, which may have nothing to do with this problem but might make 
your code easier to understand and debug: since threading is so bloody 
difficult, I try to make it affect as little of the program as 
possible. In your code, for example, I would have let the threads 
handle only the loading of the web pages, and done the parsing 
afterward, once all the threads have been joined. This is how 
FeedBlender (http://feedblender.rubyforge.org/) does it, so if there's 
a bug I can tell whether or not it's caused by the threading.
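
A rough sketch of that split, applied to the script quoted below. This is 
not FeedBlender's actual code. The names fetch_all, geo_positions and 
DEFAULT_FETCHER are made up for illustration, the worker count of 10 is 
arbitrary, and the sketch assumes a current Ruby where open-uri is called 
through URI.open rather than the bare open used in the 2005 script.

```ruby
require 'open-uri'
require 'thread'

# Assumption: on current Ruby, open-uri is reached through URI.open.
# The fetcher is a lambda so the download step can be swapped out.
DEFAULT_FETCHER = lambda { |url| URI.open(url) { |io| io.read } }

# Hypothetical helper: the ONLY threaded part. Each worker pulls URLs off
# the queue and stores either the page body or the exception it hit.
def fetch_all(urls, worker_count = 10, fetcher = DEFAULT_FETCHER)
  queue = Queue.new
  urls.each { |url| queue.enq(url) }
  # One end marker per worker, so every worker sees exactly one and stops.
  worker_count.times { queue.enq(:END_OF_WORK) }

  pages = {}
  lock = Mutex.new

  workers = (1..worker_count).map do
    Thread.new do
      # Test the dequeued item, not the queue itself, against the marker.
      while (url = queue.deq) != :END_OF_WORK
        begin
          body = fetcher.call(url)
          lock.synchronize { pages[url] = body }
        rescue StandardError => e
          # Record the failure instead of letting it kill the thread.
          lock.synchronize { pages[url] = e }
        end
      end
    end
  end

  workers.each { |th| th.join }
  pages
end

# Hypothetical helper: single-threaded parsing, run after every join.
# Returns [name, content] pairs for geo.position and ICBM meta tags.
def geo_positions(html)
  found = []
  html.scan(/<meta([^>]*)>/m).flatten.uniq.each do |attrs|
    next unless attrs =~ /\sname\s*=\s*["']([^"']+)["']/
    meta_name = $1
    next unless attrs =~ /\scontent\s*=\s*["']([^"']+)["']/
    meta_content = $1
    found << [meta_name, meta_content] if ["geo.position", "ICBM"].include?(meta_name)
  end
  found
end

blogs = [ ]  # left empty, as in the original post

fetch_all(blogs).each do |blog, result|
  if result.is_a?(Exception)
    warn "#{blog}: #{result.message}"
  else
    geo_positions(result).each { |_name, position| print "#{blog} \t #{position} \n" }
  end
end
```

With this layout, a failure recorded in the hash points at the download 
step, while a wrong or missing position points at the parsing, which can 
be exercised on a saved page with no threads involved at all.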

On Jan 8, 2005, at 8:29 PM, Jason N. Perkins wrote:

>
> On Jan 8, 2005, at 7:21 PM, Bill Atkins wrote:
>
>> Can you post the code?
>
> Sure. The blogs variable is an array of the urls of blogs - I intend 
> to eventually have these urls stored in MySQL, but for now an array 
> works. I emptied that array so that those sites that I have in it 
> aren't getting hit by too many people trying to help out. The 
> threading is derived from a sample in "Programming Ruby." I'd love any 
> additional feedback outside of dealing with the timeout issue.
>
>
> #! /usr/local/bin/ruby -w
>
> require 'open-uri'
> require 'thread'
>
> blogs = [ ]
>
> buffer=Queue.new
>
> # load the blogs into the queue
> blogs.each do |blog|
>   buffer.enq( blog )
> end
>
> consumers = (1..150).map do |i|
>   Thread.new("consumer #{i}") do |name|
>     begin
>       blog = buffer.deq
>       open( blog ) do |content|
>         begin
>           metas = content.read.scan( /<meta([^(>]*)>/m ).uniq
>           metas.each do |current_meta|
>             current_meta = current_meta.to_s
>
>             if current_meta =~ /\s+name\s*=\s*[\"']([^\"']+)[\"']/
>               name = $1
>               current_meta =~ /\s+content\s*=\s*[\"']([^\"']+)[\"']/
>               content = $1
>
>               case name
>               when "geo.position"
>                 print "#{blog} \t #{content} \n"
>
>               when "ICBM"
>                 print "#{blog} \t #{content} \n"
>               end
>             end
>           end
>         rescue Exception
>           p "#{blog}: $! \n"
>         end
>       end
>     end until buffer == :END_OF_WORK
>   end
> end
>
> begin
>   consumers.size.times{ buffer.enq(:END_OF_WORK) }
>   consumers.each{|th| th.join}
> rescue Exception
>   print $!
> end
>
>
>
>
> --
> Jason N Perkins
> <http://sneer.org/>
>
>
>

Francis Hwang
http://fhwang.net/