Issue #15770 has been updated by kou (Kouhei Sutou).

Assignee set to kou (Kouhei Sutou)
Status changed from Open to Closed

Thanks for your report.
csv 3.0.9 was recently backported to the 2.6 branch, so the next 2.6 release will include the fix for this.

----------------------------------------
Bug #15770: CSV skip_lines param affects data 
https://bugs.ruby-lang.org/issues/15770#change-77641

* Author: skyksandr (Aleksandr Kunin)
* Status: Closed
* Priority: Normal
* Assignee: kou (Kouhei Sutou)
* Target version: 
* ruby -v: 2.6.*
* Backport: 2.4: UNKNOWN, 2.5: UNKNOWN, 2.6: UNKNOWN
----------------------------------------
It works on 2.5.\*, but doesn't work on 2.6.\*

```
require 'csv'
require 'date'

counter = 0

CSV.foreach('./05-31-20.CSV', skip_lines: /^[^0-9]{4}/) do |row|
  time = row[0]

  p time if time.length < 23
  counter += 1
end

p "Processed: #{counter} lines"
```

And the result is:

```
"03-09T09:40:04.00Z"
"Processed: 4424 lines"
```

So there are two problems:
1. Line 4424 is corrupted: the first 5 characters ("2019-") are cut off.
2. The file is not parsed completely. It has **4497** lines in total, but only 4424 are processed.
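
A minimal, self-contained sketch of how both symptoms could be checked without the attached file. The sample data below is hypothetical (it is not taken from `05-31-20.CSV`) and may be too small to trigger the bug on an affected version; it only shows the comparison: rows parsed with `skip_lines` should equal the rows obtained by filtering the lines by hand and parsing each one.

```ruby
require 'csv'

# Hypothetical sample data standing in for the attached 05-31-20.CSV.
data = <<~DATA
  Timestamp,Value
  2019-03-09T09:40:03.000Z,1
  2019-03-09T09:40:04.000Z,2
  note,ignored
  2019-03-09T09:40:05.000Z,3
DATA

pattern = /^[^0-9]{4}/

# Rows as returned by CSV when it skips the matching lines itself.
skipped = CSV.parse(data, skip_lines: pattern)

# Reference result: drop matching lines by hand, then parse each remaining line.
expected = data.each_line
               .reject { |line| line.match?(pattern) }
               .map { |line| CSV.parse_line(line) }

# true means skip_lines neither altered the data nor dropped any rows.
puts skipped == expected
```

If the comparison is false, printing `skipped.size` and the first differing row shows whether rows are missing, truncated, or both.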

EDIT:
With the regex `/^(?![0-9]{4})/`, in addition to corrupting the first field, the parser hangs in an infinite loop.
Stack trace (to give you an idea of where to look):
```
	7: from ~/.rbenv/versions/2.6.2/lib/ruby/2.6.0/csv.rb:509:in `foreach'
	6: from ~/.rbenv/versions/2.6.2/lib/ruby/2.6.0/csv.rb:657:in `open'
	5: from ~/.rbenv/versions/2.6.2/lib/ruby/2.6.0/csv.rb:510:in `block in foreach'
	4: from ~/.rbenv/versions/2.6.2/lib/ruby/2.6.0/csv.rb:1176:in `each'
	3: from ~/.rbenv/versions/2.6.2/lib/ruby/2.6.0/csv/parser.rb:265:in `parse'
	2: from ~/.rbenv/versions/2.6.2/lib/ruby/2.6.0/csv/parser.rb:583:in `skip_needless_lines'
	1: from ~/.rbenv/versions/2.6.2/lib/ruby/2.6.0/csv/parser.rb:704:in `parse_row_end'
```

EDIT2: the issue is reproducible with csv `3.0.4`, but is resolved in csv `3.0.9`.
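
A small sketch for checking whether the loaded csv library is at least the version reported as fixed. The helper name `csv_fix_included?` is hypothetical. The only facts it relies on are from this report: `3.0.4` shows the problem and `3.0.9` does not, so versions between those two are not known to be either affected or fixed.

```ruby
require 'csv'
require 'rubygems'

# Version reported in this issue as containing the fix.
FIXED_CSV_VERSION = Gem::Version.new('3.0.9')

# Hypothetical helper: true if the given csv version is 3.0.9 or newer.
def csv_fix_included?(version_string)
  Gem::Version.new(version_string) >= FIXED_CSV_VERSION
end

# CSV::VERSION is the version string of the csv library that was loaded.
if csv_fix_included?(CSV::VERSION)
  puts "csv #{CSV::VERSION}: at or above the version reported as fixed"
else
  puts "csv #{CSV::VERSION}: older than 3.0.9, the skip_lines bug may apply"
end
```

On a Ruby that still ships an older csv, installing a newer csv gem and loading it in place of the bundled copy is one way to pick up the fix before the next 2.6 release.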

---Files--------------------------------
05-31-20.CSV (491 KB)
bug.rb (515 Bytes)


-- 
https://bugs.ruby-lang.org/
