Great news!
I discovered that I really only need to parse about 20% of the lines...
I put in one more optimization... specifically this one:
    next unless contract
and now the parsing time goes down to 26 seconds!!! YAHOO!!!
This is *very* tolerable!!! The final version is below...
Thanks so much for everybody's help!!!
(The data files total about 6 MB, by the way.)
-- Glenn
def Contract.parseFile(file)
  # Parse each file at most once.
  return if @@files.has_key?(file)
  return unless File.exist?(file)
  @@files[file] = 1
  print "Parsing file #{file}..."
  File.open(file, "rb") do |io|
    io.each_line do |line|
      fields = line.chomp.split(/,/)
      next if fields.size < 8
      contract = Contract.open(fields[0])
      # Skip the line before doing any date or number parsing
      # if there is no contract for it.
      next unless contract
      datestring = fields[1]
      # year = datestring[0..1].to_i
      # month = datestring[2..3].to_i
      # day = datestring[4..5].to_i
      year, month, day = datestring[0..5].scan(/../).collect {|s| s.to_i }
      # puts "year=#{year}, month=#{month}, day=#{day}"
      # RUBY BUG?!? Can't use 'date' here... as that messes up findTick's 'date'...
      mydate = Time.local((year < 50 ? year+2000 : year+1900), month, day)
      tick = Tick.new(mydate, fields[2].to_f, fields[3].to_f,
                      fields[4].to_f, fields[5].to_f,
                      fields[6].to_i, fields[7].to_i)
      contract.addTick(tick)
    end
  end
  puts "done"
end
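
For anyone who wants to try the same idea without my Contract and Tick
classes, here is a rough standalone sketch of the two parts that matter:
doing the cheap checks first, and turning a YYMMDD string into a Time with
the same "below 50 means 20xx" rule. The names parse_yymmdd and
wanted_fields, and the wanted hash of symbols, are made up for this
illustration; they are not in the code above.

```ruby
# Hypothetical helper: convert "YYMMDD" to a Time, treating two-digit
# years below the pivot as 20xx and the rest as 19xx.
def parse_yymmdd(datestring, pivot = 50)
  year, month, day = datestring[0, 6].scan(/../).map { |s| s.to_i }
  full_year = year < pivot ? year + 2000 : year + 1900
  Time.local(full_year, month, day)
end

# Hypothetical helper: return the split fields only for lines worth
# parsing, so the expensive conversions are skipped for everything else.
# 'wanted' is assumed to be a Hash keyed by contract symbol.
def wanted_fields(line, wanted)
  fields = line.chomp.split(",")
  return nil if fields.size < 8             # malformed or short line
  return nil unless wanted.key?(fields[0])  # not a contract we care about
  fields
end
```

Inside the each_line loop this would be used roughly as
"fields = wanted_fields(line, wanted) or next", with parse_yymmdd(fields[1])
called only for the lines that survive.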
Ara.T.Howard wrote:
> can you send a sample data set (contact me offline if you wish) and expected
> time to parse and let us have a crack? those times sounds distressing - is
> your data HUGE?
>
> cheers.
>
> -a