I got a similar result with your parse_csv. This raises another question: this method is also written in Ruby, so why is there such a huge overhead when we use the csv module instead of this method? How can we modify it so that the field separator and record separator can be passed as arguments?

William James wrote:
> William James wrote:
>
> > % class String
> > %   def parse_csv
> > %     a = self.scan(
> > %       %r{ "( (?: [^\\"] | \\")* )" |
> > %           '( (?: [^\\'] | \\')* )' |
> > %           ( [^,]+ )
> > %       }x ).flatten
> > %     a.delete(nil)
> > %     a
> > %   end
> > % end
>
> To test the method parse_csv, I created a 1 megabyte file consisting of
> 4228 copies of
>
> a,b,"foo, bar",c
> "foo isn't \"bar\"",a,b
> a,'"just,my,luck"',b
> 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9
> 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9
> 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9
>
> Processing it using parse_csv took about 7 seconds on my computer,
> which has a 866MHz pentium processor.
>
> Ruby's standard-lib csv.rb reported an error in the file's format.
>
> So I made a file containing 26907 copies of
>
> 111,222,333,444,555,666,777,888,999
>
> Ruby's standard-lib csv.rb took about 35 seconds to process it;
> parse_csv, about 5 seconds.