On Thu, 24 Feb 2011 12:09:48 +0900, Philip Rhoades wrote:
> People,
>
> I have a script that does:
>
> - statistical processing from data in 50x32x20 (32,000) large input
>   files
>
> - writes a small text file (22 lines with one or more columns of
>   numbers) for each input file
>
> - reads all small files back in again for final processing.
>
> Profiling shows that IO is taking up more than 60% of the time - short
> of making fewer, larger files for the data (which is inconvenient for
> random viewing/processing of individual results) are there other
> alternatives to using the "File" and "IO" classes that would be faster?
>
> Thanks,
>
> Phil.

I can think of two approaches here. First, you can write one large file
(perhaps building it in memory first) and then split it afterwards.
Second, if you're on *nix, you can write your output files to a tmpfs,
which is a memory-backed filesystem. Both should reduce the number of
disk seeks and improve performance.

-- 
WBR, Peter Zotov.
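
As a rough illustration of the first approach above, here is a minimal Ruby sketch. It assumes each result is a short list of text lines and that no data line ever starts with the marker string. The names MARKER, append_result and split_results, and the run_001/run_002 labels, are hypothetical and not taken from the original script.

```ruby
require "stringio"
require "tmpdir"

# Hypothetical marker that introduces each per-input result in the
# combined file. Assumes numeric data lines never start with it.
MARKER = "=== "

# Append one result (a name plus an array of text lines) to the buffer.
def append_result(buffer, name, lines)
  buffer << MARKER << name << "\n"
  lines.each { |line| buffer << line << "\n" }
end

# Split the combined text back into a hash of name => array of lines.
def split_results(text)
  results = {}
  current = nil
  text.each_line do |raw|
    line = raw.chomp
    if line.start_with?(MARKER)
      current = results[line[MARKER.length..-1]] = []
    elsif current
      current << line
    end
  end
  results
end

# Collect every result in memory first.
buffer = StringIO.new
append_result(buffer, "run_001", ["1.0 2.0", "3.5"])
append_result(buffer, "run_002", ["4.25"])

results = Dir.mktmpdir do |dir|
  # One large sequential write instead of thousands of small ones.
  combined = File.join(dir, "combined_results.txt")
  File.write(combined, buffer.string)

  parsed = split_results(File.read(combined))

  # Optional: split into individual small files afterwards, for
  # convenient viewing of single results.
  parsed.each do |name, lines|
    File.write(File.join(dir, "#{name}.txt"), lines.join("\n") + "\n")
  end

  parsed
end

puts results.keys.inspect
```

If the final processing runs in the same process, it could work from the in-memory results directly and skip the read-back step altogether. For the second approach, the same kind of code would apply unchanged with the output directory pointed at a tmpfs mount (on many Linux systems /dev/shm is one, though that depends on your setup).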