Issue #14363 has been reported by sos4nt (Stefan Schler).
----------------------------------------
Bug #14363: each_grapheme_cluster.size returns the wrong size
https://bugs.ruby-lang.org/issues/14363
* Author: sos4nt (Stefan Schler)
* Status: Open
* Priority: Normal
* Assignee:
* Target version:
* ruby -v: ruby 2.5.0p0 (2017-12-25 revision 61468) [x86_64-darwin15]
* Backport: 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN
----------------------------------------
Ruby 2.5 adds `String#each_grapheme_cluster` to enumerate the string's grapheme clusters:
```ruby
str = "a\u0300i\u0301" #=> "àí"
str.each_grapheme_cluster.to_a #=> ["à", "í"]
```
Unfortunately, the enumerator's `size` returns the number of characters (code points) rather than the number of grapheme clusters:
```ruby
str.each_grapheme_cluster.size #=> 4
```
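For comparison, counting by actually iterating gives the expected result. The snippet below is only an illustration using `Enumerable#count` and `each_char.size`; the variable names are mine:

```ruby
str = "a\u0300i\u0301"

# Counting by iteration walks the grapheme clusters one by one.
cluster_count = str.each_grapheme_cluster.count #=> 2

# The number of characters (code points) in the same string.
char_count = str.each_char.size #=> 4

puts cluster_count
puts char_count
```

So `each_grapheme_cluster.size` should either agree with `count` or not claim to know the size at all.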
The source code reveals that it invokes `rb_str_each_char_size`, so it is equivalent to `each_char.size`:
```c
static VALUE
rb_str_each_grapheme_cluster(VALUE str)
{
    RETURN_SIZED_ENUMERATOR(str, 0, 0, rb_str_each_char_size);
    return rb_str_enumerate_grapheme_clusters(str, 0);
}
```
If the grapheme enumerator's size cannot be calculated lazily, `each_grapheme_cluster.size` should return `nil` to indicate that.
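The two acceptable behaviours can be sketched in plain Ruby with `enum_for`, which takes an optional block for computing the size lazily. This is only an illustration, not a proposed patch: using `scan(/\X/)` as the size function is my assumption, and it is only checked against the example string above.

```ruby
str = "a\u0300i\u0301"

# Without a size block, the enumerator reports an unknown size.
unsized = str.enum_for(:each_grapheme_cluster)
unsized_size = unsized.size #=> nil

# With a size block, the size is computed on demand when `size` is called.
# Assumption: counting \X matches agrees with each_grapheme_cluster here.
sized = str.enum_for(:each_grapheme_cluster) { str.scan(/\X/).size }
sized_size = sized.size #=> 2

p unsized_size
p sized_size
```

Either result would be consistent with how `Enumerator#size` is documented, while the current behaviour reports a size that does not match the enumeration.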