James Edward Gray II wrote:
> On Dec 2, 2006, at 10:25 PM, William James wrote:
>
> > x =~ /["\n]|^\s|\s$/
>
> What is that regex doing?  Quoting any field with a quote or a
> newline, in addition to any field beginning or ending with whitespace?
>
> That fails on a field containing a comma.  Carriage returns also need
> to be escaped in CSV.

Easily remedied.

> I have no idea what the whitespace tricks are
> for either.

The standard states:

    Leading and trailing space-characters adjacent to comma field
    separators are ignored.

So quotes must be used to preserve that whitespace.

> A better test is:
>
> x.count(%Q{\r\n,"}).nonzero?

As noted above, this fails to preserve leading and trailing whitespace.

Gregory Brown wrote:
> On 12/2/06, William James <w_a_x_man / yahoo.com> wrote:
>
> > It's easy to handle all cases since CSV is a simple format;
> > no pompous prolixity is needed:
> >
> > puts ['x',' y ','He said, "No!"'].map{|x| x=x.to_s
> >   x =~ /["\n]|^\s|\s$/ ? '"' + x.gsub(/"/,'""') + '"' : x }.join(',')
> > x," y ","He said, ""No!"""
> >
> > If that won't handle 100k rows, then fasterCsv probably won't either.
>
> It is indeed faster by a long shot, but it doesn't conform to the CSV
> spec.  (See JEG2's response)

After the addition of the comma (and possibly the carriage return),
it conforms.  Remember that the _de facto_ standard is based on
how Microsoft's programs handle CSV files.  See
http://www.creativyst.com/Doc/Articles/CSV/CSV01.htm#FileFormat

puts [9, ' y ', "fee, fi", "one\ntwo", 'He said, "No!"'].
  map{|x| x=x.to_s
    x =~ /[",\n\r]|^\s|\s$/ ? '"' + x.gsub(/"/,'""') + '"' : x }.join(',')
9," y ","fee, fi","one
two","He said, ""No!"""
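
For readers who want the final quoting rule spelled out, here is a minimal sketch of the same logic as two small methods. The names `csv_field` and `csv_row` are hypothetical and do not appear in the thread. The sketch also swaps `^` and `$` for `\A` and `\z`, which anchor to the whole string rather than to each line; that should behave the same here, since any field containing a newline is already quoted by the character class.

```ruby
# Sketch only: csv_field and csv_row are illustrative names, not part of
# the original post or of any library.

# Quote a field if it contains a double quote, comma, CR or LF, or if it
# begins or ends with whitespace. Embedded double quotes are doubled.
def csv_field(value)
  s = value.to_s
  if s =~ /[",\r\n]|\A\s|\s\z/
    '"' + s.gsub('"', '""') + '"'
  else
    s
  end
end

# Join the encoded fields of one record with commas.
def csv_row(fields)
  fields.map { |f| csv_field(f) }.join(',')
end

puts csv_row([9, ' y ', 'fee, fi', "one\ntwo", 'He said, "No!"'])
```

Splitting the field test from the row join makes each rule (what triggers quoting, how quotes are escaped) easy to check on its own, at the cost of the one-liner's brevity.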