On Sun, 2001-10-14 at 20:02, Bill Kelly wrote:
>
> From: "Sean Middleditch" <elanthis / awesomeplay.com>
> >
> > [...]
> > escaped. Also, what about something like
> >
> > abc,def,\"abc,"123,456"\,xxy
> >
> > The escapes won't work. I once had a regexp that (in almost all cases)
> > properly handled this, but I don't recall what it was. Unfortunately,
> > I'm not talented enough at regexps to figure it out again without
> > another hour of work. ^,^ I don't know if handling the \ escapes is
> > important though for this situation (I had some text files at work that
> > did need it, though... was a real pain).
>
> Here's one that should handle everything EXCEPT that pesky escaped
> comma _outside_ the quoted string. :-( Is that for real ??? I
> take it that ought to tokenize to
> 'abc', 'def', '\"abc', '123,456\,xxy' ??????
>

Ya, that was the tokenization I was looking for. I don't think I've ever
seen an app do that, but after some inexperienced user decides to go
hand-tweak stuff, things can get ugly... No, I've never seen that, but I
am also a worst-case-scenario type of person. ^,^ Also, I look at the
rules of what the syntax means, and I always make sure my code can
completely follow the rules no matter how weird. There is a 1 in a
trillion chance it's needed, but oh well. I'm weird like that. ^,^

> That seems really weird because it gobbles quotes and yet concatenates
> fields (as it were) with that escaped comma following the quotes.
> I'd have expected the program generating the CSV to have output that
> field as "\"123,456\",xxy" . . . which the below can handle, but . . .
>
> Anyway, for what it's worth :-)
>
>
> require 'runit/testcase'
> require 'runit/cui/testrunner'
>
> def csv_split(str)
>   str.scan(/(?:\A|,)\s*"((?:\\"|[^"])*)"|(?:\A|,)([^",]*|[^",][^,]*)(?=,|\z)/).flatten!.compact!
> end
>

Jeez, it would take me hours to come up with that. If employers took
regexps on resumes, you could get a hell of a job with that. ~,^

> class TestCsvSplit < RUNIT::TestCase
>   def testCsvSplit
>     fields = csv_split(%q{"aaa",,"c,\"d\",",,,"fff",,,})
>     assert fields == ['aaa', '', 'c,\"d\",', '', '', 'fff', '', '', '']
>     fields = csv_split(%q{"\"",,"a\"\"b","\"c\"",})
>     assert fields == ['\"', '', 'a\"\"b', '\"c\"', '']
>     fields = csv_split(%q{abc,def,\"abc,"123,456",xxy})
>     assert fields == ['abc', 'def', '\"abc', '123,456', 'xxy']
>     fields = csv_split(%q{abc,def,\"abc,"\"123,456\",xxy"})
>     assert fields == ['abc', 'def', '\"abc', '\"123,456\",xxy']
>   end
> end
>
>
> RUNIT::CUI::TestRunner.run(TestCsvSplit.suite)
>
>
>
> Bill
>
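
For the escaped comma outside the quotes, one option might be to drop the
single regexp and walk the line one character at a time. The following is
only a rough sketch, not a drop-in replacement for csv_split: the name
csv_split_scan is made up for illustration, and it assumes the same
conventions as the tests above (surrounding quotes are dropped, backslash
escapes are kept verbatim in the field). Unlike the regexp, it does not
skip whitespace before an opening quote.

```ruby
# Hypothetical helper (name invented for this sketch): split one CSV line
# by scanning characters and tracking whether we are inside quotes.
def csv_split_scan(line)
  fields = []
  field = +''          # unfrozen empty string for the field being built
  in_quotes = false
  chars = line.chars
  i = 0
  while i < chars.length
    c = chars[i]
    if c == '\\' && i + 1 < chars.length
      # Copy the backslash and the character it escapes verbatim, so an
      # escaped quote or comma never acts as a delimiter.
      field << c << chars[i + 1]
      i += 2
      next
    elsif c == '"'
      # An unescaped quote toggles quoted mode and is not kept.
      in_quotes = !in_quotes
    elsif c == ',' && !in_quotes
      # An unescaped comma outside quotes ends the current field.
      fields << field
      field = +''
    else
      field << c
    end
    i += 1
  end
  fields << field      # last field (possibly empty); returns the array
end
```

With that, csv_split_scan(%q{abc,def,\"abc,"123,456"\,xxy}) should give the
four fields 'abc', 'def', '\"abc', '123,456\,xxy', and the four inputs in
the test case above should produce the same arrays as csv_split. The
trade-off is more lines of code in exchange for rules that are easier to
read off and extend.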