Issue #13220 has been updated by Martin Dürst.


Nobuyoshi Nakada wrote:
> Note that these results are in NFD.
> It seems to result as expected by using NFC.

This is mostly true, but there are 'visual' characters that cannot be expressed as a single code point in Unicode. As an example: "q̈".unicode_normalize.gsub("q", "x") # => "ẍ"
(The "q̈" may show with the two dots above the q or after it, depending on the font and rendering engine used by your browser or mailer; in my case, the dots appear after, but the cursor moves across the q and the dots with a single key press.)
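
To make the point above reproducible regardless of font or mailer, here is a minimal sketch that spells the combining character as an escape. The variable names are only illustrative, and it assumes nothing beyond String#unicode_normalize and String#gsub from core Ruby.

```ruby
# "q" followed by U+0308 COMBINING DIAERESIS. Unicode defines no
# precomposed code point for this pair, so NFC cannot merge it.
q_diaeresis = "q\u0308"

nfc = q_diaeresis.unicode_normalize(:nfc)
nfc_length = nfc.length            # still two code points after NFC

# gsub works on code points, so it swaps the base letter and the
# combining mark silently attaches to the replacement.
replaced = nfc.gsub("q", "x")      # "x" + U+0308

# For contrast: "a" + U+0308 does have a precomposed form (U+00E4),
# which is why NFC hides the problem for many common letters.
a_composed = "a\u0308".unicode_normalize(:nfc)
```

So NFC helps for letters such as the second one, but not for sequences like the first.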

For many of the tests, applying them to grapheme clusters might work, but there may be languages where it won't be that easy.
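
As a rough illustration of what "applying them to grapheme clusters" could look like, the sketch below splits a string with the \X regexp escape and replaces only whole clusters. This is just one possible approach, not a proposal for how String#gsub should behave, and the sample text is made up.

```ruby
# Sample text: "q" + U+0308, a space, then a plain "q".
text = "q\u0308 q"

# \X matches one grapheme cluster, so the combining mark stays
# attached to its base letter.
clusters = text.scan(/\X/)         # ["q\u0308", " ", "q"]

# Replace a cluster only when the whole cluster is exactly "q".
replaced = clusters.map { |g| g == "q" ? "x" : g }.join
```

Here only the plain "q" is replaced; the "q" carrying the diaeresis is left alone.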

Also, I don't understand why the author expects "a" for "a".next, but is happy for "a".upto("c").to_a to cycle through ["a", "b", "c"]. Here, the expectations seem to be inconsistent, but it also has to be said that e.g. Swedes would expect "a".next to be "" (see https://en.wikipedia.org/wiki/Swedish_alphabet).

----------------------------------------
Bug #13220: Enhance support of Unicode strings manipulation
https://bugs.ruby-lang.org/issues/13220#change-63227

* Author: Radovan Smitala
* Status: Feedback
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: ruby 2.4.0p0 (2016-12-24 revision 57164) [x86_64-darwin16]
* Backport: 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: UNKNOWN
----------------------------------------
Hi,

Recently, Starr Horne posted very interesting test results on manipulating Unicode strings in Ruby 2.4.
Many methods don't work as expected.

Article:

http://blog.honeybadger.io/ruby-s-unicode-support/



-- 
https://bugs.ruby-lang.org/
