All -

I've written a script to split a .csv file into smaller .csv files of
40,000 lines each. The intent is to break the file down enough that
Excel has no trouble reading each chunk. My code takes a filename from
the command line and splits it like so:

infile -> xyz.csv

output -> xyz_part_1.csv
          xyz_part_2.csv
          etc...

My code works, but I don't find it very "rubyish". In particular, I
hate having separate index and counter variables, and I don't like that
I had to declare my header variable outside of the loop. Bear in mind
that I cannot do something like "rows = CSV.open(infile)" because Ruby
errors out when the input file is this big (250 MB). Any advice on
making the code nicer is appreciated. The current code is as follows:

require 'csv'

# Stop with a usage message instead of failing later on a nil filename.
infile = ARGV[0] or abort("usage: #{$0} <file.csv>")

counter = 1
index = 0
header = ""
writer = CSV.open(infile.gsub(/\./,"_part_"+counter.to_s+"."),'w')

# CSV.foreach streams the file and yields one parsed row at a time.
CSV.foreach(infile) do |row|
  if(index != 0 && index%40000 == 0)
    writer.close
    counter+=1
    writer = CSV.open(infile.gsub(/\./,"_part_"+counter.to_s+"."),'w')
    writer << header
  end
  if (index == 0)
    header = row
  end
  writer << row
  index += 1
end

writer.close()
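
For comparison, here is a minimal sketch of one way the two counters and
the outer header variable might go away. It assumes a Ruby version where
CSV.open with a block yields a CSV object that supports shift and
Enumerable (1.9 or later); split_csv and rows_per_file are hypothetical
names, not part of the script above. It also differs slightly in that
every part gets the header plus up to rows_per_file data rows, and only
the file extension is rewritten rather than every dot in the path.

```ruby
require 'csv'

# Hypothetical helper: writes <base>_part_N<ext> files next to infile,
# each holding the header row plus at most rows_per_file data rows.
# Returns the list of files written.
def split_csv(infile, rows_per_file = 40_000)
  ext     = File.extname(infile)   # ".csv"
  base    = infile.chomp(ext)      # path without the extension
  outputs = []

  CSV.open(infile, 'r') do |csv|
    header = csv.shift             # first row; nil if the file is empty
    return outputs if header.nil?

    # each_slice keeps only one chunk of rows in memory at a time;
    # with_index(1) supplies the part number, so no manual counters.
    csv.each_slice(rows_per_file).with_index(1) do |rows, part|
      name = "#{base}_part_#{part}#{ext}"
      CSV.open(name, 'w') do |out|
        out << header
        rows.each { |row| out << row }
      end
      outputs << name
    end
  end

  outputs
end
```

Calling split_csv(ARGV[0]) from the command-line script would then
replace the whole loop. Holding one slice of 40,000 rows in memory is
the trade-off for dropping the manual bookkeeping.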

-- 
Posted via http://www.ruby-forum.com/.