On Sat, 2 Dec 2006, Paul Lutus wrote:

> Joel VanderWerf wrote:
>
>> Paul Lutus wrote:
>>> ara.t.howard / noaa.gov wrote:
>>>
>>>> for years i've felt that i should be able to pipe numerical output into
>>>> some unix command like so
>>>>
>>>>    cat list | mean
>>>>    cat list | sum
>>>>    cat list | minmax
>>>>
>>>> etc.
>>>>
>>>> and have never found one.  right now i'm building a ruby version -
>>>> before i continue, does anyone know a standard unix or ruby version of
>>>> this?
>>>
>>> It is so easy to create in Ruby, a matter of minutes, that it is not
>>> terribly important to do the search you are suggesting.
>>
>> Disagree.
>
> It's a bit too late to disagree, in the face of the evidence that I said it,
> then I did it.

i agree that it's easy to emulate awk, but shouldn't we do something better in
ruby?  i'm personally always inspired by ruby's elegance to write something
better and more extensible than what i could easily do in
shell/awk/perl/c/etc, and i've found that over the long run (say, more than 3
days) my productivity increases exponentially if i simply embrace ruby's power
to write clear, re-usable code and code it right 'the first time.'  imho it's
a shame to write throw-away scripts in ruby.

here's what i've got so far:  the concept is that each line may contain 'n'
columns of numbers, which is to say the input is not a simple list of numbers,
but a list of __rows__ of numbers: a table.  any non-numeric data is ignored,
eliminating the need to grep out crud.  also, integer arithmetic is attempted
where possible but the code falls back to floats when needed.  numbers are
parsed strictly: the code uses Integer() and Float() rather than the lenient
#to_i and #to_f, so a field such as '12abc' is ignored rather than read as 12.
the code abstracts all of the input, computation, and output functions and is
user-extensible via the use of duck-typed filters.  it's also usable both as a
library and from the command line.
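
to make the strict-parsing point concrete, here is a minimal sketch of the
Integer()-then-Float() fallback.  the helper names parse_number and numbers_in
are made up for illustration and do not appear in the script itself:

```ruby
# hypothetical helper (not in listc): try Integer() first, then Float(),
# and give up with nil so the caller can drop the field
def parse_number(field)
  Integer(field)
rescue ArgumentError, TypeError
  begin
    Float(field)
  rescue ArgumentError, TypeError
    nil
  end
end

# hypothetical helper: split a line on whitespace and keep only the fields
# that parse cleanly as numbers
def numbers_in(line)
  line.strip.split(/\s+/).map { |f| parse_number(f) }.compact
end

p "12abc".to_i                  # lenient: prints 12
p parse_number("12abc")         # strict: prints nil, so the field is ignored
p numbers_in("foo 1 bar 2.5")   # prints [1, 2.5]
```

one thing to keep in mind with this approach: Integer() honours radix
prefixes in strings, so a field like '0x10' parses as 16 and '010' as 8.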


first some examples of usage:


   mussel:~/eg/ruby/listc > cat input.a
   1
   2
   3

   mussel:~/eg/ruby/listc > ./listc sum < input.a
   6

   mussel:~/eg/ruby/listc > ./listc mean < input.a
   2.0


   mussel:~/eg/ruby/listc > cat input.b
   1 2
   3 4
   5 6

   mussel:~/eg/ruby/listc > ./listc median < input.b
   3.0 4.0


   mussel:~/eg/ruby/listc > cat input.c
   foo 1 bar 2
   a 3 b 4
   x 5 y 6

   mussel:~/eg/ruby/listc > ./listc minmax < input.c
   1:5 2:6

   mussel:~/eg/ruby/listc > ./listc min < input.c
   1 2

   mussel:~/eg/ruby/listc > ./listc max < input.c
   5 6



   mussel:~/eg/ruby/listc > cat input.d
   ---
   -
     elapsed : 770.1453289
   -
     elapsed : 620.9993257
   -
     elapsed : 1440.629573

   mussel:~/eg/ruby/listc > ./listc mean < input.d
   943.924742533333



now the code (i'm not golfing; for you non-vim users, the strange '#--{{{' and
'#--}}}' markers are vim 'folds': each folded region appears as a single line
to me):


   mussel:~/eg/ruby/listc > cat ./listc
   #! /usr/bin/env ruby

   class Main
   #--{{{
     OPS = %w( sum add mean avg median max min minmax )

     def main
       op = ARGV.shift.to_s.strip.downcase

       klass =
         case op
           when 'sum', 'add'
             SumFilter
           when 'mean', 'avg'
             MeanFilter
           when 'median'
             MedianFilter
           when 'minmax'
             MinMaxFilter
           when 'max'
             MaxFilter
           when 'min'
             MinFilter
           else
             abort "bad op <#{ op }> not in <#{ OPS.join ',' }>"
         end

       filter = klass.new

       $stdin.each{|line| filter << line}

       filter.result >> $stdout
     end
   #--}}}
   end

   def Main(*a, &b) Main.new(*a, &b).main end

   module FilterUtils
   #--{{{
     def extract_numbers line
       fields = line.strip.split(%r/\s+/)
       fields.map{|f| Integer(f) rescue Float(f) rescue nil}.compact
     end

     class List < Array
       def >> port = STDOUT
         port << join(' ')
         port << "\n"
       end
       def self.from other
         new.instance_eval{ replace other; self }
       end
     end
     def new_list l = nil
       l ? (List === l ? l : List.from(l)) : List.new
     end

     class MultiList < Array
       def >> port = STDOUT
         port << map{|elem| elem.join(':')}.join(' ')
         port << "\n"
       end
       def self.from other
         new.instance_eval{ replace other; self }
       end
     end
     def new_multilist ml = nil
       ml ? (MultiList === ml ? ml : MultiList.from(ml)) : MultiList.new
     end
   #--}}}
   end

   class SumFilter
   #--{{{
     include FilterUtils
     attr 'sum'
     def initialize
       @sum = new_list
     end
     def << line
       numbers = extract_numbers line
       numbers.each_with_index do |n,i|
         @sum[i] ||= 0
         @sum[i] += n
       end
     end
     def result
       @sum
     end
   #--}}}
   end

   class MeanFilter
   #--{{{
     include FilterUtils
     attr 'sum'
     attr 'count'
     def initialize
       @sum = new_list
       @count = new_list
     end
     def << line
       numbers = extract_numbers line
       numbers.each_with_index do |n,i|
         @sum[i] ||= 0
         @count[i] ||= 0
         @sum[i] += n
         @count[i] += 1
       end
     end
     def result
       mean = new_list
       @sum.zip(@count){|s,c| mean << (s.to_f/c.to_f)}
       mean
     end
   #--}}}
   end

   class MedianFilter
   #--{{{
     include FilterUtils
     attr 'columns'
     def initialize
       @columns = []
     end
     def << line
       numbers = extract_numbers line
       numbers.each_with_index do |n,i|
         # a true median needs every value in the column, not just min and max
         (@columns[i] ||= []) << n
       end
     end
     def result
       median = new_list
       @columns.each do |values|
         sorted = values.sort
         lo = sorted[(sorted.size - 1)/2]
         hi = sorted[sorted.size/2]
         # odd count: lo and hi are the same middle element
         median << ((lo + hi)/2.0)
       end
       median
     end
   #--}}}
   end

   class MinMaxFilter
   #--{{{
     include FilterUtils
     attr 'minmax'
     def initialize
       @minmax = new_multilist
     end
     def << line
       numbers = extract_numbers line
       numbers.each_with_index do |n,i|
         @minmax[i] ||= [n,n]
         @minmax[i][0] = [ @minmax[i][0], n ].min
         @minmax[i][1] = [ @minmax[i][1], n ].max
       end
     end
     def result
       @minmax
     end
   #--}}}
   end

   class MinFilter < MinMaxFilter
   #--{{{
     def result
       new_list @minmax.map{|minmax| minmax.first}
     end
   #--}}}
   end

   class MaxFilter < MinMaxFilter
   #--{{{
     def result
       new_list @minmax.map{|minmax| minmax.last}
     end
   #--}}}
   end

   Main() if __FILE__ == $0
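
the filters above share a small duck-typed protocol: each one accepts lines
via #<< and hands back a #result that can write itself to a port with #>>.  as
a rough sketch of how a user-defined filter might plug into that protocol, here
is a per-column counter.  CountFilter and its Row class are hypothetical names
invented for this example; Row is only a stand-in for the script's List so the
snippet runs on its own:

```ruby
require 'stringio'

# hypothetical filter (not in listc): counts how many numbers were seen in
# each column.  it only follows the same protocol: #<< per line, then #result.
class CountFilter
  # stand-in for the script's List: an Array that writes itself to a port
  class Row < Array
    def >>(port)
      port << join(' ') << "\n"
    end
  end

  def initialize
    @count = Row.new
  end

  def <<(line)
    fields = line.strip.split(/\s+/)
    # same strict parsing idea as the script: non-numeric fields become nil
    numbers = fields.map { |f| Integer(f) rescue Float(f) rescue nil }.compact
    numbers.each_with_index do |_, i|
      @count[i] = (@count[i] || 0) + 1
    end
    self
  end

  def result
    @count
  end
end

# library-style usage: feed lines from any source, write to any port
filter = CountFilter.new
"a 1 b 2\nc 3\n".each_line { |line| filter << line }
out = StringIO.new
filter.result >> out
print out.string   # prints "2 1"
```

wiring such a filter into the command line would also need a new branch in
the case statement in Main#main and an entry in OPS.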




of course this code isn't perfect, but if i'm going to spend time adding a
list of numbers i'm going to put in at least this much effort.


kind regards.


-a
-- 
if you want others to be happy, practice compassion.
if you want to be happy, practice compassion.  -- the dalai lama