rio4ruby wrote:

> James Edward Gray II wrote:
>> On Feb 28, 2006, at 7:59 PM, Oliver Cromm wrote:
>>
>>> The speed difference looks too extreme to me:
>>>
>>>
>>>   caps = []
>>>   File.open('caps_u8.dic').each {|line| caps << line.split(';')[0]}
>>>
>>> => 1.8 seconds
>>
>> Here you are rolling your own split.
>>
>>>   require 'rio'
>>>   caps = rio('caps_u8.dic').csv(";").columns(0)[].flatten
>>>   p caps
>>>
>>> => 50.9 seconds
>>
> 
> This is a false comparison. The speedy code will not properly parse
> many CSV files.

I didn't claim they are equivalent in principle, but for the purpose at
hand they are. In this case I wouldn't have cared if one version took
5 times as long, but 25 times is not practical. A speed difference that
large would easily justify, say, 15 more minutes of programming, which
would let me cover a lot of cases.

> For example, the following is a legal line from a CSV file:
> 
>   "Field 1","Hello, World", "Field 3"

I doubt that a split(/\",\s*\"/) (plus necessary adjustments) would be
much slower.
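
A minimal sketch of what such a split might look like, assuming every
field is double-quoted and no field contains an embedded quote or line
break. The name split_quoted is a hypothetical helper for illustration,
not part of rio or any CSV library:

```ruby
# Hypothetical helper: split one line of fully quoted CSV into fields.
# Assumptions: every field is wrapped in double quotes, and no field
# contains an embedded double quote or a newline.
def split_quoted(line)
  # Strip the line ending, then remove the outermost pair of quotes.
  inner = line.strip.sub(/\A"/, '').sub(/"\z/, '')
  # Split on: closing quote, comma, optional whitespace, opening quote.
  # The -1 limit keeps trailing empty fields instead of dropping them.
  inner.split(/",\s*"/, -1)
end

line = %q{"Field 1","Hello, World", "Field 3"}
p split_quoted(line)
# => ["Field 1", "Hello, World", "Field 3"]
```

The comma inside "Hello, World" survives because the pattern only
matches a comma that sits between two quote characters. Unquoted fields
or escaped quotes ("") would still need a real CSV parser.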
-- 
Oliver C.
  The main demographic target group of "Focus" is men from the
  upper middle class between 40 and 65 (IQ, not age).
    Andreas Kabel in de.etc.sprache.deutsch