Issue #12034 has been updated by Yui NARUSE.


Eric Wong wrote:
> I think that is fine as long as the strings are valid.
> Returning invalid strings is the main problem, I think;
> and we should stop doing that. Dir.entries and similar methods
> have the same problem.
>
> How about fall back to ASCII-8BIT if we detect broken code range?

How Ruby should treat invalid paths is a difficult problem.
I once decided on the filesystem encoding, but I am willing to change it if another encoding is better in practice.

In this case neither the filesystem encoding nor ASCII-8BIT will work: paths.join raises Encoding::CompatibilityError
even if the method returns ASCII-8BIT strings instead of invalid filesystem-encoding strings.
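
A minimal sketch of the join failure, assuming a UTF-8 source file; the file names are hypothetical and not taken from the report. It only shows that an ASCII-8BIT string carrying non-ASCII bytes cannot be joined with a UTF-8 string carrying non-ASCII characters:

```ruby
# Hypothetical file names, not taken from the report.
valid  = "caf\u00E9.txt"   # valid UTF-8 containing a non-ASCII character
binary = "\xFF.txt".b      # a byte that is invalid in UTF-8, tagged ASCII-8BIT

# Joining the two raises because both strings contain non-ASCII content
# and their encodings differ.
join_error =
  begin
    [valid, binary].join("|")
    nil
  rescue Encoding::CompatibilityError => e
    e
  end

puts join_error.class

# ASCII-only strings stay compatible regardless of their encoding tag.
ascii_joined = [valid, "plain.txt".b].join("|")
puts ascii_joined.encoding
```

So tagging a broken name as ASCII-8BIT moves the failure to the point where the name is combined with other strings.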

For the other use case, simply showing filenames, retrieving the filenames (including invalid strings)
and calling String#scrub works fine.
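
A short sketch of that display use case, assuming a UTF-8 source file and a hypothetical file name that contains one byte invalid in UTF-8:

```ruby
# Hypothetical file name with one byte that is invalid in UTF-8.
raw = "report-\xFF.txt"

puts raw.valid_encoding?      # false

# String#scrub replaces each invalid byte sequence with the given string.
shown = raw.scrub("?")
puts shown                    # report-?.txt
puts shown.valid_encoding?    # true
```

The scrubbed copy is only suitable for display; the original bytes are still needed to open the file.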

Therefore, at this time I don't think changing to ASCII-8BIT is a good thing.

----------------------------------------
Feature #12034: RegExp does not respect file encoding directive
https://bugs.ruby-lang.org/issues/12034#change-56976

* Author: Vit Ondruch
* Status: Open
* Priority: Normal
* Assignee: 
----------------------------------------
~~~
$ cat regexp-encoding.rb
# -*- encoding: binary -*-
puts ''.encoding
puts //.encoding

$ ruby regexp-encoding.rb 
ASCII-8BIT
US-ASCII
~~~

The RegExp should have ASCII-8BIT encoding IMO.




Actually, Ruby 2.3 seems to behave differently with regard to encoding: I can no longer build the raindrops gem with Ruby 2.3 because of this test error:

~~~
 1) Error:
TestLinux#test_unix_resolves_symlinks:
RegexpError: /.../n has a non escaped non ASCII character in non ASCII-8BIT script
    /builddir/build/BUILD/rubygem-raindrops-0.13.0/usr/share/gems/gems/raindrops-0.13.0/lib/raindrops/linux.rb:57:in `unix_listener_stats'
    /builddir/build/BUILD/rubygem-raindrops-0.13.0/usr/share/gems/gems/raindrops-0.13.0/test/test_linux.rb:97:in `test_unix_resolves_symlinks'
~~~

This is the line where it fails:

http://bogomips.org/raindrops.git/tree/lib/raindrops/linux.rb#n57



---Files--------------------------------
0001-string.c-rb_external_str_with_enc-fall-back-to-ASCII.patch (1.47 KB)
0002-follow-up-for-OS-X.patch (1.52 KB)

