Issue #8405 has been updated by naruse (Yui NARUSE).

Target version set to current: 2.1.0


----------------------------------------
Bug #8405: CSV module - improper regexp for escaping special characters
https://bugs.ruby-lang.org/issues/8405#change-39337

Author: dunric (David Unric)
Status: Assigned
Priority: Normal
Assignee: JEG2 (James Gray)
Category: lib
Target version: current: 2.1.0
ruby -v: 2.0.0p0
Backport: 1.9.3: UNKNOWN, 2.0.0: UNKNOWN


=begin
There seems to be a bug in the csv.rb module. If you use a special character such as (({|})) as the quote_char (passed as a parameter to CSV methods like read), the program terminates with a (({CSV::MalformedCSVError: Missing or stray quote in line xxx})) error, even if the input .csv file is correct.

Below is the assignment of the Regexp used for escaping characters that are special in regular expressions:

  1587:  @re_chars =   /#{%"[-][\\.^$?*+{}()|# \r\n\t\f\v]".encode(@encoding)}/

The issue is with the leading (({[-]})), which I find completely wrong: it causes the pattern to miss all the matches it was intended to find. The hyphen character "(({-}))" has to be escaped only inside brackets (({[]})), and only if it does not immediately follow the left bracket.
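To illustrate the problem, here is a minimal sketch that rebuilds the pattern outside csv.rb. It assumes the (({.encode(@encoding)})) step can be dropped for plain ASCII input, and the names (({original})) and (({escape})) are hypothetical helpers written for this example, not part of the CSV library:

```ruby
# Pattern copied from the assignment above, minus .encode(@encoding)
# (assumption: the encoding step does not change the regexp source).
# Resulting source: [-] followed by a class of special characters.
original = /#{%"[-][\\.^$?*+{}()|# \r\n\t\f\v]"}/

# Hypothetical stand-in for the escaping step: prefix each match with "\".
escape = ->(str, re) { str.gsub(re) { |c| "\\" + c } }

# A lone special character is never matched, because the pattern
# requires a literal hyphen in front of it.
p original =~ "|"                 # => nil
p escape.call("|", original)      # => "|"   (left unescaped)

# Only the two-character sequence "hyphen + special character" matches.
p original =~ "-|"                # => 0
p escape.call("-|", original)     # => "\\-|"
```

With (({|})) left unescaped, any parser regexp built from the quote_char treats it as alternation, which would be consistent with the error described above.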

A quick fix for the above issue may look like this:

  1587:  @re_chars =   /#{%"(?<!\\[)-(?=.*\\])|[\\.^$?*+{}()|# \r\n\t\f\v]".encode(@encoding)}/

I'd like to mention that it would also match strings that include a right bracket without its left counterpart, but that doesn't matter anyway. Lookbehind doesn't support quantifiers in Ruby, so a stricter check would require rewriting the whole substitution code where the pattern is applied.
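The behaviour of the quick fix can be checked with a short sketch. As before, it assumes the (({.encode(@encoding)})) step can be omitted for ASCII input, and (({proposed})) and (({escape})) are hypothetical names used only for this example:

```ruby
# Proposed pattern from the quick fix above, minus .encode(@encoding).
proposed = /#{%"(?<!\\[)-(?=.*\\])|[\\.^$?*+{}()|# \r\n\t\f\v]"}/

# Hypothetical stand-in for the escaping step: prefix each match with "\".
escape = ->(str, re) { str.gsub(re) { |c| "\\" + c } }

# A lone special character is now matched and escaped.
p proposed =~ "|"                        # => 0
escaped = escape.call("|", proposed)
p escaped                                # => "\\|"

# The escaped form is safe to embed in another regexp as a literal.
p Regexp.new(escaped) =~ "a|b"           # => 1

# A hyphen right after "[" is left alone.
p proposed =~ "[-]"                      # => nil

# The caveat mentioned above: a hyphen before an unpaired "]" still matches.
p proposed =~ "a-z]"                     # => 1
```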
=end



-- 
http://bugs.ruby-lang.org/