Interesting. I'll do a bit of comparison work between that approach
and native Python "threading" support. I'd assume the DRb approach to
be slower. Just out of curiosity, has any work been done towards
improving the threading system? The current system is great for
systems that might not fully support threading out of the box, but
I'd really like to see support for POSIX threads at the system
level.

-Jeff



On Sep 30, 2005, at 12:38 AM, Rick Nooner wrote:

> On Fri, Sep 30, 2005 at 12:20:07PM +0900, Jeff McNeil wrote:
>
>> Greetings.
>>
>> I apologize if this has been answered somewhere obvious. I did a fair
>> bit of Googling prior to piping up. If this has been addressed,
>> please feel free to simply point and grunt!
>>
>> In a nutshell, I have concerns surrounding the Ruby threads
>> implementation as it appears to be a home-grown system.  How
>> would performance stack up against FreeBSD 5.x KSE threads?  I've run
>> into problems with the Python dummy_thread module before with respect
>> to performance, especially when dealing with intensive IO. I have a
>> bit of a fear that I'll have the same issue here.
>>
>> I've been tasked with writing a dynamic HTTPS/HTTP gateway of sorts,
>> and as such, it ought to get quite busy in the network IO
>> department.  I'd like to take advantage of thread pooling and
>> whatnot.
>>
>> I'd really love to use Ruby as I'm quickly falling in love with it -
>> what a nice language.
>>
>> Thoughts?
>>
>> Jeff
>>
>
> I wrote a data collection system that gathers statistics from over
> 2000 servers every five minutes 24/7.  This has both high network
> usage patterns as well as high disk usage patterns.
>
> I also had a 4 processor box (a Sun E4500).  Ruby threads cannot
> take advantage of a multiprocessor server.
>
> In order to have each collection cycle finish within the 5 minute
> window allotted, all 4 processors must be fully utilized.
>
> The architecture that I settled on was similar to a threaded worker
> pool, except instead of threads I used processes with the main
> process acting as the scheduler and the child processes reading
> work tasks from a distributed queue (Ruby makes this easy with
> Rinda).  This allows scaling both by adding more processors
> (and processes) to the server OR because of Rinda (and DRb)
> simply by adding more collection servers.
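
A minimal sketch of the process-pool pattern described above, not the
actual collection system. Everything named here is an assumption made
for illustration: the `run_pool` helper, the `[:task, n]` and
`[:result, n, value]` tuple shapes, the `:stop` marker, and squaring a
number as a stand-in for real work. The parent acts as scheduler and
serves a Rinda tuple space over DRb; forked children take tasks from it
and write results back.

```ruby
require 'drb/drb'
require 'rinda/tuplespace'
require 'socket'

# Hypothetical helper: pushes each integer task through a pool of worker
# processes that pull from a shared Rinda tuple space.
# Returns a hash of {task => result}.
def run_pool(tasks, workers: 2)
  # Reserve a free local port so the workers know the URI before the
  # parent starts serving (small race between close and reuse).
  probe = TCPServer.new('127.0.0.1', 0)
  uri = "druby://127.0.0.1:#{probe.addr[1]}"
  probe.close

  # Fork before starting DRb so each child is a plain DRb client and
  # does not inherit a copy of the parent's server state.
  pids = Array.new(workers) do
    fork do
      space = DRbObject.new_with_uri(uri)
      loop do
        begin
          _, job = space.take([:task, nil])   # blocks until a task exists
        rescue DRb::DRbConnError
          sleep 0.05                          # parent not listening yet
          retry
        end
        break if job == :stop
        space.write([:result, job, job * job]) # stand-in for real work
      end
      exit!(0) # skip any at_exit hooks inherited from the parent
    end
  end

  # Parent: serve the tuple space, queue the work, collect the results.
  space = Rinda::TupleSpace.new
  DRb.start_service(uri, space)
  tasks.each { |t| space.write([:task, t]) }
  results = tasks.map { space.take([:result, nil, nil]) }
                 .to_h { |_, job, value| [job, value] }

  # One stop marker per worker, then reap the children.
  workers.times { space.write([:task, :stop]) }
  pids.each { |pid| Process.wait(pid) }
  DRb.stop_service
  results
end
```

Because the workers only need the `druby://` URI, the same loop could in
principle run on another host pointed at the scheduler's address, which
is the "add more collection servers" half of the design.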
>
> So far, this solution has been running over a year and a half
> on the single 4 processor server with no unscheduled down time.
>
> My take is don't use the Ruby threading model for
> network intensive tasks.  Rather think about using DRb
> and process level parallelism.  You might be surprised
> at how well it works and how scalable this makes your
> system.
>
> Rick
>
> -- 
> Rick Nooner
> rick / nooner.net
> http://www.nooner.net
>
>
>