ara.t.howard wrote:
<snip>
> 4) bj, on the other hand, simply provides a way to fire and forget system 
> calls.  these system calls just may happen to use ./script/runner to run 
> some code from within your rails environment, but that's up to you.  it 
> may even contact a long running daemon like backgrounddrb to avoid 
> loading your rails app over and over, but again that's up to you.  bj 
> does *not* load your rails app or make that code available in any way.  
> all it does is connect to the db and run jobs from a queue - which is 
> another big difference: bj is a priority queue, you can submit 100,000 
> jobs and forget about it, they will run serially in the background until 
> they are complete.  another result of the design is that you can easily 
> fire up runners on other hosts using bj - thereby creating a *cluster* 
> of machines that run jobs on behalf of your front end(s) rails 
> application.  and, of course, it's easy for development to submit jobs 
> into a production queue and vice versa.  the last major difference is 
> that bj is queuing jobs in the database whereas backgrounddrb is dealing 
> with memory/context/closures - if you have backgrounded 100k credit card 
> sales and your application crashes you can probably guess where having 
> the jobs live would be best ;-)  with bj the act of submitting a job is 
> a db transaction that's submitted a job which can run on its own two 
> feet so you *know* once submission is complete that, no matter what 
> happens next, that job is recoverable - at least to the extent your 
> database/fs are.

My word.  I think you've just saved me a ton of work.  Yet again.

A quick question, though.  How difficult is it to set up parallel job 
queues, so that a cluster node can pick up jobs from one queue, process 
them, and submit them to the next in a chain?  Take a search engine's 
spider as an example - from 20,000 feet you've got a job that fetches a 
page, a job to parse the contents, followed by a third to index the 
parsed structure.  Chances are that you want different types of cluster 
node to work on each type of job, and there's different data that you 
might want to attach at each stage.  Is that easy to set up?
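
To make the question concrete, here is roughly the shape I have in mind, 
as a toy in-memory sketch.  To be clear, this is not bj's API: every name 
in it (StageQueue, Job, run_stage) is made up for illustration, the 
"fetch" is faked, and nothing here is persisted to a database the way 
bj's queue is.

```ruby
# Toy model of chained job queues.  NOT bj's API; all names are hypothetical.
Job = Struct.new(:payload, :priority)

class StageQueue
  def initialize
    @jobs = []
  end

  def submit(payload, priority: 0)
    @jobs << Job.new(payload, priority)
  end

  # Highest priority first; ties are broken by submission order.
  def pop
    return nil if @jobs.empty?
    best = @jobs.each_with_index.max_by { |job, i| [job.priority, -i] }
    @jobs.delete_at(best.last)
  end

  def empty?
    @jobs.empty?
  end
end

# Drain one queue, handing each result to the next queue in the chain.
def run_stage(from, to)
  while (job = from.pop)
    result = yield(job.payload)
    to.submit(result, priority: job.priority) if to
  end
end

fetch_q, parse_q, index_q = StageQueue.new, StageQueue.new, StageQueue.new
INDEX = {}

fetch_q.submit({ url: "http://example.com/a" }, priority: 1)
fetch_q.submit({ url: "http://example.com/b" }, priority: 5)

# "fetch" node: fake the download and attach the page body.
run_stage(fetch_q, parse_q) { |j| j.merge(body: "hello from #{j[:url]}") }

# "parse" node: attach the parsed structure (here, just a word list).
run_stage(parse_q, index_q) { |j| j.merge(words: j[:body].split) }

# "index" node: last stage in the chain, so there is no next queue.
run_stage(index_q, nil) do |j|
  j[:words].each { |w| (INDEX[w] ||= []) << j[:url] }
  j
end
```

Each stage only ever reads from its own queue and writes to the next, so 
in principle a different type of node could own each run_stage call, and 
the payload picks up extra data (body, then words) as it moves along.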

-- 
Alex