Jordan Callicoat wrote:
..snip snip
> Also, you could use a little trick with Hash; just index the rows in a
> hash by their date, then when you hit a duplicate date, you'll just
> overwrite the previous value indexed by that date (change the order of
> reading in file1 and file2 to keep historical rows rather than new
> ones, the current order keeps new rows):
> 
> hash = {}
> data = File.readlines(file1) +
>        File.readlines(file2)
> data.each { |row|
>   date = row[5..12]
>   hash[date] = row
> }
> data = hash.values.sort
> 
> Regards,
> Jordan

Jordan,

Thanks for the suggestion.  I am implementing the hash idea you 
provided.  That way I can keep my historical data (for common dates) and 
just grab new data for new dates.
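A quick irb check confirmed which way the overwrite goes -- the row 
assigned last wins, so the file whose rows should survive a duplicate 
date has to be read last:

   h = {}
   h['20041201'] = 'row read first'
   h['20041201'] = 'row read last'
   h['20041201']   # => "row read last"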

Now just a little tweak.  The symbol data is not always just 3 
characters, so I can't slice the date out by position the way 
row[5..12] does.  I am currently using a regular expression to split 
the row on commas, and I get the date from the split.

           # Using Jordan's methodology: index the rows by date so that
           # duplicate dates collapse to a single row (the row read
           # last wins)
           hash = {}
           data = File.readlines(path) + File.readlines(second)
           data.each { |row|
              # op rather than open, so the variable can't be confused
              # with Kernel#open used below
              (sym, date, op, high, low, close, vol) = row.split(/,/)
              hash[date] = row
           }
           data = hash.values.sort
           open(second, 'w') { |f| f.puts data }


Works fine, but I really don't care about any of the information past 
the date field.
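
One side note from poking at the output: hash.values.sort compares the 
whole row string, so with the symbol in the first column the rows come 
out grouped by symbol and only then by date.  If a straight date order 
is ever wanted, sorting on the date field itself should do it:

   data = hash.values.sort_by { |row| row.split(/,/)[1] }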
Additionally, I have another set of files with an extra column after 
vol, and I am not sure how to handle that in the regular expression.

I just want to do something like:

   (symbol, date, ignore_the_rest) = row.split(/,/)

for just the first two columns.  I am off to read more on regular 
expressions.
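
From a first look at the String#split docs, it seems the extra columns 
may take care of themselves: parallel assignment simply drops values it 
has no variable for, split takes an optional limit so the remainder is 
not chopped up, and a splat can collect whatever is left (which should 
also cover the files with the extra column after vol).  Something like:

   # extra fields are simply dropped in a parallel assignment
   sym, date = row.split(/,/)

   # or cap the split at 3 pieces so everything past the date
   # stays together in one (ignored) chunk
   sym, date = row.split(/,/, 3)

   # or collect the remainder, in case it is ever needed
   sym, date, *rest = row.split(/,/)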

Thanks
Snoopy
