2008/4/8, Michael Linfield <globyy3000 / hotmail.com>:
> Robert Klemme wrote:
>  > 2008/4/8, Michael Linfield <globyy3000 / hotmail.com>:
>
> >>  output2 = output[356131..712260]
>  >>     end
>
> >>  Any ideas that would speed this up are much appreciated!! Otherwise I'll
>  >>  be back in 3 months IF I dont get an error :D
>  >
>  > Obviously there is a lot of code missing from the piece above.  Can
>  > you explain, what you are trying to achieve?  What is your input file
>  > format and what kind of transformation do you want to do on it?  I
>  > looked through your other postings but it did not become clear to me.
>  >
>  > Cheers
>  >
>  > robert
>
>
> Alright heres the breakdown of everything.
>
>
>  dataArray = []
>
>  # arrayOut consist of all integer data stored in a text file.
>  # its called upon via IO.foreach("data.txt"){|x| dataArray << x}
>  # dataArray being just a predefined array ie: dataArray = []
>
>
>  output = arrayOut.to_s.chop!.split(",")
>
>
> #Each of these outputs breaks down this huge array into 4 smaller arrays
>
> output1 = output[0..356130]
>  output2 = output[356131..712260]
>  output3 = output[712261..1068390]
>  output4 = output[1068391..1424521]
>
>
> #hashRange[out] is basically calling a hash in the following context.
>  # hash = { 1=> { 20000..30000 => 12345 } }
>  #so 'out' is calling the range of the key to which contains its defined
>  #value
>  #basically its saying hashRange[25000]    #=> 12345   as an example
>
>  #everything imported to dataArray is a string, so it must be converted
>  #to an integer to correctly match the range key
>
>  #after benchmarking some elements of the loop below its found to be
>  #the push = hashRange[out] is whats slowing everything down.
>  #everything a nil 'out' is shoved into the query it takes about 8sec.
>  #when its a correct number, takes about 5sec
>
>  #the hashRange file is about 78mb, to which I had to load in as
>  #8 separate data files, then shove those into an eval to convert it
>  #to a hash
>
>
>  count = 0
>     output1.each do |out|
>       out = out.to_i
>       push = hashRange[out]
>       dataArray << push
>       count+=1
>       puts "#{push} - #{count}"     #Testing purposes
>     end
>
>
> #I guess what I need now is a faster way to access this pre-defined hash.
>  #SQL is one possibility but that could be considered a whole other
>  #forum post :)
>
>  Any other questions feel free to ask,
>  Your guy's insight is much appreciated.

Let's see whether I understood correctly: you have a file with
multiple integers per line.  You have defined a range mapping,
i.e. each interval an integer can fall into has a label.  You want to
read in all the integers and output their labels.

If this is correct, this is what I'd do:

$ ruby -e '20.times {|i| puts i}' >| x
14:54:37 /c/Temp
$ ./rl.rb x
low
low
medium
medium
medium
high
high
high
high
high
no label
no label
no label
no label
no label
no label
no label
no label
no label
no label
14:54:41 /c/Temp
$ cat rl.rb
#!/usr/bin/env ruby

class RangeLabels
  def initialize(labels)
    @labels = labels.sort_by {|key,lab| key}
  end

  def lookup(val)
    # Each key is the exclusive upper bound of its interval.
    # Linear scan: slow for many ranges, binary search would improve this.
    @labels.each do |key, lab|
      return lab if val < key
    end
    "no label"
  end
end

rl = RangeLabels.new [
  [2, "low"],
  [5, "medium"],
  [10, "high"],
]

# Print the labels of all integers on each input line, comma separated.
ARGF.each do |line|
  puts line.scan(/\d+/).map { |val| rl.lookup(val.to_i) }.join(", ")
end
14:54:52 /c/Temp
$
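
The comment in lookup mentions binary search.  Here is a minimal
sketch of how that could look, assuming Ruby 2.0 or later (for
Array#bsearch).  The class name SortedRangeLabels is made up for this
example, and I have not measured it against your 78MB data set:

```ruby
# Hypothetical variant of RangeLabels: same interface, but lookup uses
# binary search instead of a linear scan.
class SortedRangeLabels
  def initialize(labels)
    # Keys are exclusive upper bounds and must be sorted ascending
    # for the binary search to work.
    @labels = labels.sort_by { |key, _lab| key }
  end

  def lookup(val)
    # bsearch in find-minimum mode: returns the first pair for which
    # the block is true, i.e. the smallest key greater than val.
    pair = @labels.bsearch { |key, _lab| val < key }
    pair ? pair[1] : "no label"
  end
end

rl = SortedRangeLabels.new [
  [2, "low"],
  [5, "medium"],
  [10, "high"],
]

puts rl.lookup(0)   # low
puts rl.lookup(2)   # medium
puts rl.lookup(9)   # high
puts rl.lookup(10)  # no label
```

With n ranges each lookup then needs about log2(n) comparisons rather
than up to n, which should matter with a mapping as large as yours.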

Kind regards

robert

-- 
use.inject do |as, often| as.you_can - without end