On Sun, 2001-10-14 at 20:02, Bill Kelly wrote:
> 
> From: "Sean Middleditch" <elanthis / awesomeplay.com>
> >
> [...]
> > escaped.  Also, what about something like
> > 
> > abc,def,\"abc,"123,456"\,xxy
> > 
> > The escapes won't work.  I once had a regexp that (in almost all cases)
> > properly handled this, but I don't recall what it was.  Unfortunately,
> > I'm not talented enough at regexps to figure it out again without
> > another hour of work.  ^,^  I don't know if handling the \ escapes is
> > important though for this situation (I had some text files at work that
> > did need it, though... was a real pain).
> 
> Here's one that should handle everything EXCEPT that pesky escaped
> comma _outside_ the quoted string.  :-(  Is that for real ???  I
> take it that ought to tokenize to
> 'abc', 'def', '\"abc', '123,456\,xxy' ??????
> 

Yeah, that was the tokenization I was looking for.

I don't think I've ever seen an app do that, but once some
inexperienced user decides to go hand-tweak stuff, things can get
ugly...

No, I've never seen that, but I am also a worst-case-scenario type of
person.  ^,^  Also, I look at the rules of what the syntax means, and I
always make sure my code can completely follow the rules, no matter how
weird.  There is a 1 in a trillion chance it's needed, but oh well.  I'm
weird like that.  ^,^
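
For what it's worth, here is a rough sketch of a non-regexp way to get
that tokenization, walking the string one character at a time.  The
rules are my assumptions, not anything from a spec: a backslash keeps
itself and the next character verbatim, an unescaped quote toggles
quoted mode and is dropped, and an unescaped comma outside quotes ends
the field.  The name csv_split_escaped is made up for illustration, and
unlike the regexp below it does not strip whitespace before a quote.

```ruby
# Hypothetical helper: split a CSV-ish line while honoring backslash
# escapes both inside and outside quoted sections.
def csv_split_escaped(str)
  fields = []
  field = String.new
  in_quotes = false
  chars = str.each_char.to_a
  i = 0
  while i < chars.length
    c = chars[i]
    if c == "\\" && i + 1 < chars.length
      # Keep the backslash and the escaped character as-is.
      field << c << chars[i + 1]
      i += 2
      next
    elsif c == '"'
      # Unescaped quote: toggle quoted mode and drop the quote itself.
      in_quotes = !in_quotes
    elsif c == "," && !in_quotes
      # Unescaped comma outside quotes: the current field is complete.
      fields << field
      field = String.new
    else
      field << c
    end
    i += 1
  end
  fields << field
  fields
end

line = %q{abc,def,\"abc,"123,456"\,xxy}
p csv_split_escaped(line) == ['abc', 'def', '\"abc', '123,456\,xxy']
```

It is a lot more lines than a regexp, but each rule sits in its own
branch, so the odd cases are easy to change.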

> That seems really weird because it gobbles quotes and yet concatenates
> fields (as it were) with that escaped comma following the quotes.
> I'd have expected the program generating the CSV to have output that
> field as "\"123,456\",xxy" . . . which the below can handle, but . . .
> 
> Anyway, for what it's worth  :-)
> 
> 
> require 'runit/testcase'
> require 'runit/cui/testrunner'
> 
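> def csv_split(str)
>     # Non-bang flatten/compact always return an array; the bang forms
>     # return nil when they change nothing (e.g. when scan finds no match).
>     str.scan(/(?:\A|,)\s*"((?:\\"|[^"])*)"|(?:\A|,)([^",]*|[^",][^,]*)(?=,|\z)/).flatten.compact
> end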
> 

Jeez, it would take me hours to come up with that.  If employers took
regexps on resumes, you could get a hell of a job with that.  ~,^
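
To help myself read it, here is the same pattern spread out in extended
(x) mode with comments.  This is only my reading of how the pieces fit
together; the names CSV_FIELD and csv_split_annotated are made up for
this sketch, and the pattern itself is unchanged apart from whitespace.

```ruby
# The quoted regexp, laid out in extended mode. Whitespace and comments
# inside the literal are ignored by the regexp engine.
CSV_FIELD = %r{
  (?:\A|,) \s* "           # field start, optional whitespace, opening quote
  ( (?: \\" | [^"] )* )    # group 1: body; a backslash-quote pair is taken as a unit
  "                        # closing quote
  |
  (?:\A|,)                 # field start for an unquoted field
  ( [^",]* | [^",][^,]* )  # group 2: plain text, or text holding a quote after its first character
  (?=,|\z)                 # next must be a comma or the end of the string
}x

def csv_split_annotated(str)
  # Each match fills exactly one of the two groups; drop the nil one.
  str.scan(CSV_FIELD).flatten.compact
end

line = %q{abc,def,\"abc,"123,456",xxy}
p csv_split_annotated(line) == ['abc', 'def', '\"abc', '123,456', 'xxy']
```

The second branch of group 2 is what lets a field like \"abc through:
the first character may not be a quote or comma, but later ones may be
quotes.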

> class TestCsvSplit < RUNIT::TestCase
>     def testCsvSplit
>         fields = csv_split(%q{"aaa",,"c,\"d\",",,,"fff",,,})
>         assert fields == ['aaa', '', 'c,\"d\",', '', '', 'fff', '', '', '']
>         fields = csv_split(%q{"\"",,"a\"\"b","\"c\"",})
>         assert fields == ['\"', '', 'a\"\"b', '\"c\"', '']
>         fields = csv_split(%q{abc,def,\"abc,"123,456",xxy})
>         assert fields == ['abc', 'def', '\"abc', '123,456', 'xxy']
>         fields = csv_split(%q{abc,def,\"abc,"\"123,456\",xxy"})
>         assert fields == ['abc', 'def', '\"abc', '\"123,456\",xxy']
>     end
> end
> 
> 
> RUNIT::CUI::TestRunner.run(TestCsvSplit.suite)
> 
> 
> 
> Bill
> 
>