Issue #11814 has been updated by Akinori MUSHA.


I have given up on this idea for now, because I concluded that the use cases would not be as broad as I had expected, and that adding valid_encoding?(enc) alone would not be enough for anyone serious about encoding detection. (Sorry usa-san!)

However, since this issue has been raised, let me share one good use case for future readers.

Suppose you have a list of byte arrays whose encoding you don't know, for example when you want to guess the encoding of the file names stored in a zip file.

If you had String#valid_encoding?(enc), you could find an encoding in which every one of them is valid, without modifying, copying or concatenating strings:

~~~
POSSIBLE_ENCODINGS = [Encoding::UTF_8, Encoding::Windows_31J, Encoding::ISO_8859_1, Encoding::ASCII_8BIT]

encoding = byte_arrays.inject(POSSIBLE_ENCODINGS) { |encs, b|
  encs & POSSIBLE_ENCODINGS.select { |enc| b.valid_encoding?(enc) }
}.first
~~~
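For comparison, here is a rough sketch of the same idea that runs on Ruby as it is today. `valid_in?` and `guess_encoding` are hypothetical helper names, not existing methods; `valid_in?` stands in for the proposed `valid_encoding?(enc)` and, unlike the proposal, creates a temporary String object with `dup` for every check. The sample file names are made up for illustration.

```ruby
POSSIBLE_ENCODINGS = [Encoding::UTF_8, Encoding::Windows_31J, Encoding::ISO_8859_1, Encoding::ASCII_8BIT]

# Hypothetical stand-in for the proposed String#valid_encoding?(enc).
# It retags a duplicate, so the original string is left untouched.
def valid_in?(bytes, enc)
  bytes.dup.force_encoding(enc).valid_encoding?
end

# Returns the first candidate encoding in which every byte array is valid.
def guess_encoding(byte_arrays)
  byte_arrays.inject(POSSIBLE_ENCODINGS) { |encs, b|
    encs & POSSIBLE_ENCODINGS.select { |enc| valid_in?(b, enc) }
  }.first
end

# Made-up sample: a Shift_JIS file name and a plain ASCII one, as raw bytes.
names = ["\x83e\x83X\x83g.txt".b, "readme.txt".b]
guessed = guess_encoding(names)  # Windows-31J is the first candidate that fits both
```

Because ISO-8859-1 and ASCII-8BIT accept any byte sequence, the order of POSSIBLE_ENCODINGS decides the result; the stricter encodings have to come first.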

----------------------------------------
Feature #11814: String#valid_encoding? without force_encoding
https://bugs.ruby-lang.org/issues/11814#change-55523

* Author: Usaku NAKAMURA
* Status: Rejected
* Priority: Normal
* Assignee: 
----------------------------------------
Currently we have to set an encoding on a string in order to validate it, like this:

```ruby
str.force_encoding('euc-jp').valid_encoding?  # => true or false
```

But modifying the string just to validate it is not ideal.
knu-san needs a way to validate a string without modifying it [*1].
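To illustrate the side effect, here is a small sketch with made-up sample bytes. It shows `force_encoding` changing the receiver's encoding in place, and then the usual workaround of retagging a duplicate, which leaves the original alone at the cost of an extra String object.

```ruby
# Two raw bytes (EUC-JP for the hiragana "a"), tagged as ASCII-8BIT.
str = "\xA4\xA2".b
before = str.encoding                                 # ASCII-8BIT
valid = str.force_encoding('euc-jp').valid_encoding?
after = str.encoding                                  # the receiver itself is now tagged EUC-JP

# Workaround available today: validate a duplicate instead.
raw = "\xA4\xA2".b
valid_copy = raw.dup.force_encoding('euc-jp').valid_encoding?
untouched = raw.encoding                              # still ASCII-8BIT
```

The duplicate avoids the mutation, but it is still an extra object per check, which is what an encoding argument would make unnecessary.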

So I propose adding an optional encoding parameter to `String#valid_encoding?`.

```ruby
str.valid_encoding?('euc-jp')  # => true or false
```

A patch is attached.

[*1] https://twitter.com/knu/status/676009662655934465 (in Japanese)

---Files--------------------------------
valid_encoding.patch (4.4 KB)


-- 
https://bugs.ruby-lang.org/