On Sat, 10 Sep 2005, Glenn M. Lewis wrote:

> Thanks a bunch, Hugh and Eric!  The combination of your
> two suggestions sped it up quite a bit.
>
> I don't agree with Robert, though...  I have written many
> parsers in C++ (and before that, C) that could soak up
> all the data that I'm reading in less than a second whereas
> this is taking approximately 9 minutes in Ruby.  With the
> recommendations of Hugh and Eric, it is now down to about
> 5 minutes, or almost a factor of 2 speedup.
>
> I would really like an order of magnitude or more, but
> I would definitely have to write it in a compiled language.
> I've done this before with Ruby and C++ using SWIG, but
> this particular one seemed really challenging when having
> Ruby call C++ which would then call Ruby...
>
> My last project with Ruby/C++/SWIG had Ruby calling C++
> but C++ kept all the data structures internally without
> ever calling Ruby, and this was *much* easier... but not
> as flexible as I would like for this case.
>
> I may have to rewrite this whole puppy in D if I'm going
> to get parsing times under one second.  Using C++ and STL
> for its map containers is a royal nuisance, but D has
> built-in associative arrays.  Or maybe I should try Perl
> or Python and see how their file parsing speeds compare.
>
> Oh, and to answer Hugh's question, it is extremely rare
> that a line would have less than 8 fields... sometimes
> the last line of the file has only a ^Z on it.
>
> Thanks again for your help!  I appreciate it.
> -- Glenn

can you send a sample data set (contact me offline if you wish) and expected
time to parse and let us have a crack?  those times sound distressing - is
your data HUGE?

cheers.

-a
--
===============================================================================
| email :: ara [dot] t [dot] howard [at] noaa [dot] gov
| phone :: 303.497.6469
| Your life dwells among the causes of death
| Like a lamp standing in a strong breeze.  --Nagarjuna
===============================================================================