Issue #7156 has been updated by naruse (Yui NARUSE).

Status changed from Feedback to Rejected

The argument to URI needs to be escaped.
Ruby may support unescaped URIs once browsers' URL handling becomes well defined.
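
As an illustration of escaping the argument first, here is a minimal sketch. The helper name escape_segment, the host example.com, and the sample character are assumptions made for the example, not part of the standard library or of this report. It percent-encodes every byte outside the RFC 3986 unreserved set before the string reaches URI.parse.

```ruby
require 'uri'

# Hypothetical helper: percent-encode each byte that is not an
# RFC 3986 "unreserved" character (letters, digits, "-", ".", "_", "~").
def escape_segment(segment)
  # String#b returns a binary-tagged copy, so the regexp scans raw bytes.
  segment.b.gsub(/[^A-Za-z0-9\-._~]/n) { |byte| format('%%%02X', byte.ord) }
end

word = "\u0431" # a Cyrillic letter, chosen only for illustration
uri  = URI.parse("http://example.com/wiki/#{escape_segment(word)}")

puts uri.path
```

Only a single path segment should be passed to such a helper, because it would also encode "/" and "?".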

----------------------------------------
Bug #7156: Invalid byte sequence in US-ASCII when using URI from std lib
https://bugs.ruby-lang.org/issues/7156#change-74539

* Author: t0d0r (Todor Dragnev)
* Status: Rejected
* Priority: Normal
* Assignee: naruse (Yui NARUSE)
* Target version: 
* ruby -v: 1.9.3
* Backport: 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN
----------------------------------------
Invalid byte sequence in US-ASCII on ruby 1.9.3

I get this error when trying to open a URL containing Bulgarian text (utf-8: "ڧ"). The problem seems to be in uri/common.rb from the Ruby standard library.

Adding str.force_encoding(Encoding::BINARY) to the following method fixes the problem:

class URI::Parser
  def escape(str, unsafe = @regexp[:UNSAFE])
    unless unsafe.kind_of?(Regexp)
      # perhaps unsafe is String object
      unsafe = Regexp.new("[#{Regexp.quote(unsafe)}]", false)
    end
    str.force_encoding(Encoding::BINARY) # FIX
    str.gsub(unsafe) do
      us = $&
      tmp = ''
      us.each_byte do |uc|
        tmp << sprintf('%%%02X', uc)
      end
      tmp
    end.force_encoding(Encoding::US_ASCII)
  end
end

One more suggestion: maybe US_ASCII should be replaced with Encoding::BINARY too?
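
The following sketch shows one way this kind of error can arise, and why scanning the bytes as binary avoids it. It assumes the input is UTF-8 text that has been tagged as US-ASCII. The helper percent_encode is a simplified stand-in written for this example, not the code from uri/common.rb.

```ruby
# Illustrative helper: percent-encode every non-alphanumeric byte.
def percent_encode(str)
  str.gsub(/[^A-Za-z0-9]/n) { |m| m.bytes.map { |b| format('%%%02X', b) }.join }
end

# Two UTF-8 bytes mislabeled as US-ASCII (an assumed trigger for the error).
mislabeled = "\xD0\xB1".dup.force_encoding(Encoding::US_ASCII)

begin
  percent_encode(mislabeled)
rescue ArgumentError => e
  # The bytes are not valid US-ASCII, so gsub refuses to scan the string.
  error = e
end

# String#b returns a binary-tagged copy and leaves the caller's string alone.
fixed = percent_encode(mislabeled.b)
puts fixed
```

Unlike calling force_encoding on the argument, String#b (available since Ruby 2.0) does not mutate the caller's string, so it also works on frozen strings.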

---Files--------------------------------
bulgarian.rb (61 Bytes)


-- 
https://bugs.ruby-lang.org/
