Issue #7156 has been updated by naruse (Yui NARUSE).
Status changed from Feedback to Rejected
The argument passed to URI needs to be escaped.
Ruby may support unescaped URIs once browsers' URL handling becomes well defined.
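As a rough illustration of escaping before handing the string to URI, here is a minimal sketch. The host, the path prefix, and the sample text are assumptions made for the example; URI.encode_www_form_component is intended for form data (it encodes a space as "+"), which makes no difference for this input.

```ruby
require 'uri'

# Sample Cyrillic text written with \u escapes so the literal is UTF-8
# regardless of the source file encoding (illustrative value only).
segment = "\u0418\u0441\u0442\u043E\u0440\u0438\u044F"

# Percent-encode every non-ASCII byte of the segment.
escaped = URI.encode_www_form_component(segment)

# Hypothetical host and path prefix; only the escaped segment matters here.
uri = URI.parse("http://example.com/wiki/#{escaped}")

puts escaped
puts uri.path
```

After this step the string given to URI.parse contains only ASCII characters, so the parser never sees raw multibyte data.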
----------------------------------------
Bug #7156: Invalid byte sequence in US-ASCII when using URI from std lib
https://bugs.ruby-lang.org/issues/7156#change-74539
* Author: t0d0r (Todor Dragnev)
* Status: Rejected
* Priority: Normal
* Assignee: naruse (Yui NARUSE)
* Target version:
* ruby -v: 1.9.3
* Backport: 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN
----------------------------------------
Invalid byte sequence in US-ASCII on ruby 1.9.3
I receive this error when trying to open a URL containing Bulgarian text (UTF-8: "История"). The problem seems to be in uri/common.rb in the Ruby standard library.
Adding str.force_encoding(Encoding::BINARY) to the following method fixes the problem:
class URI::Parser
  def escape(str, unsafe = @regexp[:UNSAFE])
    unless unsafe.kind_of?(Regexp)
      # perhaps unsafe is String object
      unsafe = Regexp.new("[#{Regexp.quote(unsafe)}]", false)
    end
    str.force_encoding(Encoding::BINARY) # FIX
    str.gsub(unsafe) do
      us = $&
      tmp = ''
      us.each_byte do |uc|
        tmp << sprintf('%%%02X', uc)
      end
      tmp
    end.force_encoding(Encoding::US_ASCII)
  end
end
One more suggestion: maybe Encoding::US_ASCII should be replaced with Encoding::BINARY too?
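Note that force_encoding changes the encoding of the receiver in place, so the patched method also alters the caller's string. A minimal sketch of the same byte-wise escaping on a binary copy follows. The helper name escape_high_bytes is hypothetical and not part of the uri library, and it escapes only bytes 0x80-0xFF rather than the full UNSAFE set.

```ruby
# Hypothetical helper for illustration; not part of uri/common.rb.
def escape_high_bytes(str)
  # String#b (Ruby 2.0+) returns a copy tagged ASCII-8BIT,
  # leaving the encoding of the caller's string untouched.
  binary = str.b

  # On a binary string every character is one byte, so each match is a single byte.
  escaped = binary.gsub(/[\x80-\xFF]/n) { |byte| format('%%%02X', byte.ord) }

  # Only ASCII bytes remain, so tagging the result as US-ASCII is valid.
  escaped.force_encoding(Encoding::US_ASCII)
end

input = "\u0418" # one Cyrillic letter, two bytes in UTF-8
puts escape_high_bytes(input)
puts input.encoding
```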
---Files--------------------------------
bulgarian.rb (61 Bytes)
--
https://bugs.ruby-lang.org/