On Sat, 21 Feb 2004, Hal Fulton wrote:

> Ara.T.Howard wrote:
> > don't limit yourself to scripting tasks!  i'm currently designing an entire
> > near-realtime satellite image processing system in ruby!
> 
> Ara,
> 
> That's pretty cool. What can you tell us about this without violating some
> kind of confidentiality policy? :)

ha!  check out the url in my sig for general info...

i can't elaborate much at the moment as i'm slammed with a deadline but the
_design_ so far is:

  - various job classes utilizing image processing classes that use guy's mmap
    module (considering some stuff using narray).  our algorithms are all
    custom so no packaged lib would work.  eg. we aren't doing any normal image
    ops - we go through pixels and scanlines applying custom algorithms to
    them to detect stuff (nighttime lights of the world).  this is the
    scientist's bag - i avoid it and consider only values 0-255 and issues of
    massive i/o.  typical files are 100mb each and explode into about 1.5 GB
    of output.  we currently have programs that _require_ 2.5mb of ram.  it's
    absolutely ridiculous..

  - an entire system built around sge (sun grid engine) and seriously heavy
    duty netapps and mass storage units - lots of data
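
to make the pixel/scanline pass from the first item concrete, here is a minimal
sketch.  it is not our production code: it uses plain File i/o rather than the
mmap module, and the width, the threshold and the method names are all made up
for illustration.  the only things taken from the description above are that
pixels are values 0-255 and that the image is walked one scanline at a time so
the whole file never has to sit in ram.

```ruby
require 'tempfile'

# yields [row_index, array_of_byte_values] one scanline at a time
# (hypothetical helper - width is an assumed, fixed pixels-per-line)
def each_scanline(io, width)
  row = 0
  while (line = io.read(width))
    break if line.bytesize < width   # ignore a truncated trailing line
    yield row, line.unpack('C*')     # 'C*' => unsigned 8 bit values, 0-255
    row += 1
  end
end

# returns [[row, col], ...] for every pixel at or above threshold
# (a stand-in for a real detection algorithm)
def bright_pixels(path, width, threshold)
  hits = []
  File.open(path, 'rb') do |f|
    each_scanline(f, width) do |row, pixels|
      pixels.each_with_index do |value, col|
        hits << [row, col] if value >= threshold
      end
    end
  end
  hits
end

# tiny demo: a 4 pixel wide, 2 line 'image'
Tempfile.create('img') do |f|
  f.binmode
  f.write([0, 10, 255, 3, 200, 0, 0, 199].pack('C*'))
  f.flush
  p bright_pixels(f.path, 4, 200)   # => [[0, 2], [1, 0]]
end
```

memory use here is bounded by one scanline, which is the point when inputs are
100mb and outputs are over a gigabyte.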


essentially we process incoming satellite data and archive it.  in addition
certain clients subscribe to have us do custom data processing/delivery for
them.  the current system - isn't - it's a massive pile of scripts that call
scripts that call scripts...

  1024.times{ puts "scripts that call scripts" }

....

i jest not.


the central problems are

  - the processing is the 'science'.  eg. it changes almost daily
  - the volumes of data and cpu requirements present all sorts of challenges
  - setting a 'subscription' (custom processing for a customer) requires a
    ground-up approach.  there is _zero_ reusable code currently
  - the logging is non-existent
  - we can't use a database (politics)
  - our sysads are horrible

my current work has been

  - figure out a persistence strategy.  i wanted to use pstore for a lot of it
    but couldn't since it didn't support locking over nfs.  see
    
      http://raa.ruby-lang.org/list.rhtml?name=posixlock

  - configurable work flows see

      http://raa.ruby-lang.org/list.rhtml?name=flow

  - controlling various external processes (idl, c programs, fortran, etc) in a
    way that allows fine control over output see

      http://raa.ruby-lang.org/list.rhtml?name=session

  - quite a few in-house packages to deal with various data formats and do
    various calculations see

      http://www.codeforpeople.com/lib/ruby/dmsp/
      http://www.codeforpeople.com/lib/ruby/envi/
      http://www.codeforpeople.com/lib/ruby/solpos/
      http://www.codeforpeople.com/lib/ruby/stpjob/
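
for the 'controlling external processes' item, a minimal sketch of the idea
using only ruby's standard open3 library.  this is _not_ the api of the
session package linked above - run_job is a hypothetical helper, and the child
command is just the ruby interpreter standing in for an idl/c/fortran program.
it shows stdout and stderr being captured separately along with the exit
status, so each can be logged or parsed on its own.

```ruby
require 'open3'
require 'rbconfig'

# runs cmd, feeding it input on stdin, and returns its output streams
# separately plus the exit status (hypothetical helper)
def run_job(*cmd, input: '')
  out, err, status = Open3.capture3(*cmd, stdin_data: input)
  { stdout: out, stderr: err, ok: status.success?, exit_code: status.exitstatus }
end

# demo: a child process that writes to both streams and exits non-zero
child = '$stdout.puts "pixels done"; $stderr.puts "warn: clipped"; exit 3'
result = run_job(RbConfig.ruby, '-e', child)

puts "stdout: #{result[:stdout]}"
puts "stderr: #{result[:stderr]}"
puts "failed with #{result[:exit_code]}" unless result[:ok]
```

capture3 waits for the child to finish and buffers everything in memory, so it
suits short jobs; a long-running program with a lot of output would want a
streaming approach instead.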

all the above is a literal snapshot of what i'm doing - as is normal around
here i haven't had much time for docs...  sorry.


at the exact moment i'm working on a configurable set of jobs that generates
fire products using nighttime satellite imagery.  it's basically a testbed for
what the near-realtime system might be but i am still very much in the
'enumerate the problem hotspots and check them off' phase.  i still haven't
been able to check them all off so my 'design' ( a strong term ) is very much
in flux - suggestions (especially in the form of solved.tar.gz) are welcome!  ;-)


a year ago i designed a bi-temporal database emulator (ruby classes) and a web
interface to it.  that system was almost all ruby and is in production here
at noaa.  see

  http://www.fsl.noaa.gov/publications/forum/feb2003/2_03_MMeta.html
  http://raa.ruby-lang.org/list.rhtml?name=btpgsql

and specifically

  http://www.codeforpeople.com/lib/ruby/btpgsql/btpgsql-0.2.4/doc/

for quite a long discussion of the database part.  the web interface is,
unfortunately, viewable only from the intranet.  it's based on fastcgi and
postgresql though.
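
for anyone who hasn't met the term, a toy sketch of what 'bi-temporal' means.
it is not the btpgsql api - the class, the method names and the integer
timestamps are all invented for illustration.  every row carries two time
axes: when the fact was true (valid time) and when it was recorded
(transaction time).  rows are never updated or deleted, so you can ask what
was believed at any earlier moment.

```ruby
# one fact: key had value during [valid_from, valid_to), recorded at recorded_at
Row = Struct.new(:key, :value, :valid_from, :valid_to, :recorded_at)

class BiTemporalStore
  FOREVER = Float::INFINITY

  def initialize
    @rows = []
  end

  # append only: corrections are new rows, old rows stay put
  def put(key, value, valid_from:, now:, valid_to: FOREVER)
    @rows << Row.new(key, value, valid_from, valid_to, now)
    self
  end

  # what did we believe, as of known_at, about the value at valid_at?
  def get(key, valid_at:, known_at:)
    candidates = @rows.select do |r|
      r.key == key &&
        r.recorded_at <= known_at &&
        r.valid_from <= valid_at && valid_at < r.valid_to
    end
    latest = candidates.max_by(&:recorded_at)   # newest recording wins
    latest && latest.value
  end
end

store = BiTemporalStore.new
store.put('sat', 'F15', valid_from: 10, now: 100)   # first recording
store.put('sat', 'F16', valid_from: 10, now: 200)   # later correction

p store.get('sat', valid_at: 15, known_at: 150)   # => "F15"
p store.get('sat', valid_at: 15, known_at: 250)   # => "F16"
p store.get('sat', valid_at: 5,  known_at: 250)   # => nil
```

a real implementation also has to split and close out overlapping valid-time
ranges, which this sketch skips entirely.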


> Also, there's a "real world Ruby" page in the wiki somewhere.  This might
> make a nice addition.

when time allows...

back to work...

cheers.

-a


-- 
===============================================================================
| EMAIL   :: Ara [dot] T [dot] Howard [at] noaa [dot] gov
| PHONE   :: 303.497.6469
| ADDRESS :: E/GC2 325 Broadway, Boulder, CO 80305-3328
| URL     :: http://www.ngdc.noaa.gov/stp/
| TRY     :: for l in ruby perl;do $l -e "print \"\x3a\x2d\x29\x0a\"";done 
===============================================================================