Issue #16842 has been reported by sawa (Tsuyoshi Sawada).

----------------------------------------
Bug #16842: `inspect` prints the UTF-8 character U+0085 (NEXT LINE) verbatim even though it is not printable
https://bugs.ruby-lang.org/issues/16842

* Author: sawa (Tsuyoshi Sawada)
* Status: Open
* Priority: Normal
* ruby -v: ruby 2.8.0dev (2020-05-09T13:24:57Z master 889b0fe46f) [x86_64-linux]
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN
----------------------------------------
The UTF-8 character U+0085 (NEXT LINE) is not printable, but `inspect` prints it verbatim (within double quotes):

```ruby
0x85.chr(Encoding::UTF_8).match?(/\p{print}/) # => false
0x85.chr(Encoding::UTF_8).inspect
# => "\"\u0085\"" (the raw U+0085 character appears between the quotes; written here as an escape)
```
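Since U+0085 is invisible in most terminals, looking at code points may make the result easier to read. The following is only a sketch; the variable name `nel` is illustrative and not part of the report, and the last output is left unannotated because it depends on the Ruby version in use:

```ruby
nel = 0x85.chr(Encoding::UTF_8)

# U+0085 is encoded as two bytes in UTF-8.
p nel.bytes # => [194, 133]

# Code points of the inspected string. 34 is the double quote; an unescaped
# U+0085 would show up as 133, while an escaped one starts with 92 (backslash).
p nel.inspect.codepoints
```

If 133 appears between the two 34s, the character was emitted verbatim.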

My understanding is that non-printable characters are not printed verbatim with `inspect`:

```ruby
"\n".match?(/\p{print}/) # => false
"\n".inspect # => "\"\\n\""
```

while printable characters are printed verbatim:

```ruby
"a".match?(/\p{print}/) # => true
"a".inspect # => "\"a\""
```

I ran the following script and found that U+0085 is the only character in the range U+0000 to U+FFFF for which `inspect` and `\p{print}` disagree in this way.

```ruby
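# True unless `inspect` renders the character as a backslash escape that
# starts with a lowercase letter (e.g. \n, \e, \x7F, \u0080).
def verbatim?(char)
  !char.inspect.start_with?(%r{\"\\[a-z]})
end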

def printable?(char)
  char.match?(/\p{print}/)
end

(0x0000..0xffff).each do |i|
  begin
    char = i.chr(Encoding::UTF_8)
  rescue RangeError
    next
  end
  puts '%#x' % i unless verbatim?(char) == printable?(char)
end
```
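The same comparison can also be written so that it returns the mismatching code points instead of printing them, which makes it easier to rerun on other Ruby versions. This is a sketch, not the script used for the report: `raw_in_inspect?` and `mismatches` are hypothetical helper names, and the "verbatim" check is swapped for a test of whether the raw character survives in the `inspect` output. It assumes the default external encoding is UTF-8; under other encodings `inspect` may escape more characters.

```ruby
# Hypothetical helper: true when the raw character survives in the output of
# String#inspect, i.e. it was not replaced by an escape sequence.
def raw_in_inspect?(char)
  char.inspect.include?(char)
end

def printable?(char)
  char.match?(/\p{print}/)
end

# Returns the code points in the given range for which the two predicates
# disagree.
def mismatches(range)
  range.select do |i|
    begin
      char = i.chr(Encoding::UTF_8)
    rescue RangeError
      next false # surrogates (U+D800..U+DFFF) cannot be converted
    end
    raw_in_inspect?(char) != printable?(char)
  end
end

mismatches(0x0000..0xffff).each { |i| puts format('U+%04X', i) }
```

On a build that shows the reported behavior, this would be expected to list U+0085.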


