On Thu, 23 Mar 2006 the.liberal.media / gmail.com wrote:

> I'm undertaking a project that will eventually become a processing pipeline
> application of sorts.  It will receive data in a common format, transform it
> into one of _many_ other formats, compile and send it off to various
> endpoints (sounds spammish, but it's all solicited, really :).
>
> My team and I have tentatively decided on openMosix to provide an easily
> scalable cluster, and Ruby for the application itself.  I'm very new to
> Ruby, and fairly new to IPC concepts -- The Little Book of Semaphores and
> the many threads I've read here have helped me out a _lot_.
>
> Our application will rely on 3 sets of consumer process pools; each pool is
> spawned by a daemon responsible for each basic operation (think MTA):
> receive, transform, compile/send.  By using processes instead of threads we
> allow openMosix to migrate each process and make use of the entire cluster.
>
> So we have the model down, but I need a bit of advice on how to most
> efficiently get these processes talking.  What is the best form of IPC to
> use here?  It seems there are tons of Ruby examples on concurrency and
> communication between threads, but I can't seem to find anything definitive
> on IPC (to more than one child at least).  I tried the sysv extension off
> RAA, but couldn't get it to compile -- though I didn't try my best.
>
> Things I'm considering:
>
> - DRb
> - UNIXSocket
> - mkfifo
> - SysV message queue (openMosix doesn't support shmem segments)
> - popen (though I can't see how to do it without round robin
> producing)
>
> If anyone has any advice, please shove me in the right direction.
>
> Thanks in advance!
>
> Also, thanks to matz for the great language (code blocks are uber
> bueno)!
>
> Best,
> Dan

i recently built a system __exactly__ like this for noaa.  it's built upon
ruby queue (rq) for the clustering and dirwatch for the event driven
components.  both are on rubyforge and/or raa.  the system uses a unique library
that allows classes to be parameterized, loaded, and run on input data in a few
short lines.  here is one class representing a processing flow

   class Flo5 < NRT::OLSSubscription::Geotiffed
     mode "production"

     roi 47,16,39,27
     satellites %w( F15 F16 )
     extensions %w( OIS )

     solarelevations -180, -12, 10.0

     hold 0

     username "flo"
     password "xxx"

     orbital_start_direction "descending"
   end

this is one hundred percent of the coding needed to inject a new processing
flow into the system, have incoming files spawn jobs for it, distribute
processing to a cluster, and to package and deliver data.
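
for anyone wondering how class-level declarations like the ones above can work
in plain ruby, here is a minimal sketch.  it is not the code behind
NRT::OLSSubscription::Geotiffed: the names Subscription, param, accepts? and
ExampleFlow are all made up for illustration, and only a handful of the
parameters are modelled.  the idea is simply that each declaration is a class
method which stores its arguments on the class when called with some, and
returns them when called bare.

```ruby
# hypothetical base class, for illustration only; none of these names come
# from the real library.
class Subscription
  # define one class-level macro per name.  called with arguments it records
  # them on the receiving class; called bare it returns the recorded value,
  # falling back to the parent class when this class never set one.
  def self.param(*names)
    names.each do |name|
      define_singleton_method(name) do |*args|
        @params ||= {}
        if args.empty?
          if @params.key?(name)
            @params[name]
          elsif superclass.respond_to?(name)
            superclass.public_send(name)
          end
        else
          # a single argument is stored as-is, several are stored as an array
          @params[name] = args.size == 1 ? args.first : args
        end
      end
    end
  end

  param :mode, :roi, :satellites, :extensions, :hold

  # assumed matching rule: a file belongs to a flow when both its satellite
  # and its extension were declared by that flow.
  def self.accepts?(satellite:, extension:)
    Array(satellites).include?(satellite) &&
      Array(extensions).include?(extension)
  end
end

# a flow declared the same way as the class shown earlier
class ExampleFlow < Subscription
  mode "production"
  roi 47, 16, 39, 27
  satellites %w( F15 F16 )
  extensions %w( OIS )
  hold 0
end

p ExampleFlow.mode                                             # "production"
p ExampleFlow.roi                                              # [47, 16, 39, 27]
p ExampleFlow.accepts?(satellite: "F15", extension: "OIS")     # true
```

because the values live on each class rather than on instances, a dispatcher
could in principle walk the loaded subclasses and ask each one whether it
accepts an incoming file before submitting a job for it.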

i'd like to do a write-up about it in the near future but it's only just gone
alpha and i'm still pretty busy.  in any case there are at least 6 or 7 components
that can be used from this system in any other such system.  ping me on or off
line and i can give you some more info... right now i'm late for a meeting...

   http://www.linuxjournal.com/article/7922
   http://raa.ruby-lang.org/project/rq/
   http://raa.ruby-lang.org/project/dirwatch/

kind regards.

-a
-- 
share your knowledge.  it's a way to achieve immortality.
- h.h. the 14th dalai lama