Oliver Cromm wrote:
> The speed difference looks too extreme too me:
> ...
> What exactly is so slow here?

Good question. The first problem is that you are using a general-purpose CSV parser to split strings. However, the difference you report is too extreme for that to be the only issue.

I created four test cases:

ruby split:
    caps = []
    File.open(fn).each { |line| caps << line.chomp.split(',')[0] }

rio split:
    caps = []
    rio(fn).chomp.lines { |line| caps << line.split(',')[0] }

stdlib csv:
    caps = []
    File.open(fn).each { |line| caps << CSV.parse_line(line)[0] }

rio csv:
    caps = rio(fn).csv.columns(0)[].flatten

Benchmarking these cases on a 10000-line CSV file yielded:

    ruby split:  0.516000
    rio split :  0.984000
    stdlib csv:  3.047000
    rio csv   : 15.610000

This shows that Rio incurs a 2x overhead when reading lines from a file (0.984 / 0.516), which is reasonable considering the features of Rio not illustrated in this trivial example.

Using the standard library's CSV incurs a 6x overhead over the plain split (3.047 / 0.516), which seems a bit high but is not unreasonable, considering the difference in complexity between splitting a string and parsing a CSV line. The CSV module could probably be more efficient.

Using Rio to call the standard library's CSV incurs a 5x overhead on top of calling the standard library's CSV directly (15.610 / 3.047). This yields an overall overhead of 30x compared to the plain ruby split (15.610 / 0.516), which is close to what you report (28x).

The 5x overhead incurred when using Rio to call CSV does seem too high. One would expect it to be closer to 2x. The reason for the high overhead is the feature of Rio that extends every Array returned from a CSV file with a custom +to_s+ method, which converts the Array back to a CSV line. Without this feature the "rio csv" case yields:

    rio csv   :  5.750000

which is 1.9x the stdlib CSV time (5.750 / 3.047).

I was dubious that extending each Array was a good thing even if it cost nothing. It is certainly not a good thing with such a high performance penalty. I will remove this feature in the next release.

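To get a feel for the cost of extending each row, here is a minimal sketch that compares building plain Arrays with building Arrays that are each extended with a module providing a custom +to_s+. The module name CsvLine, the helper time_it, and the join-based +to_s+ are assumptions made for illustration; this is not Rio's actual code, and the numbers it prints will differ from the ones above.

```ruby
# Hypothetical stand-in for the kind of module mixed into each CSV row.
# The name and the naive join are assumptions, not Rio's implementation.
module CsvLine
  def to_s
    join(',')
  end
end

# Small timing helper (hypothetical name) based on the monotonic clock.
def time_it
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

line = 'a,b,c,d,e'
n = 10_000

# Baseline: split each line into a plain Array.
plain = time_it do
  n.times { line.split(',') }
end

# Same work, but every resulting Array is extended with CsvLine.
extended = time_it do
  n.times { line.split(',').extend(CsvLine) }
end

# An extended row converts back to a CSV-style line.
row = line.split(',').extend(CsvLine)
puts row.to_s

printf("plain:    %.6f s\n", plain)
printf("extended: %.6f s\n", extended)
```

Each call to extend creates a singleton class for that one Array, so the cost is paid per row rather than once per file.
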
Beyond this, the only things that will make Rio's handling of CSV files faster are a faster CSV module (FasterCSV perhaps) and performance improvements in Rio itself, which will be addressed when Rio reaches Alpha.

Thanks for bringing this to my attention.

Cheers,
-Christopher