Issue #14363 has been reported by sos4nt (Stefan Schüßler).

----------------------------------------
Bug #14363: each_grapheme_cluster.size returns the wrong size
https://bugs.ruby-lang.org/issues/14363

* Author: sos4nt (Stefan Schüßler)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: ruby 2.5.0p0 (2017-12-25 revision 61468) [x86_64-darwin15]
* Backport: 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN
----------------------------------------
Ruby 2.5 adds `String#each_grapheme_cluster` to enumerate the string's grapheme clusters:

```ruby
str = "a\u0300i\u0301"          #=> "àí"
str.each_grapheme_cluster.to_a  #=> ["à", "í"]
```

Unfortunately, the enumerator's `size` doesn't work as expected:

```ruby
str.each_grapheme_cluster.size  #=> 4
```
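For comparison, here is a small sketch that puts the relevant counts side by side. It assumes Ruby 2.5 or later and a UTF-8 source file; the variable names `char_size` and `cluster_count` are only illustrative. The last line prints whatever `size` reports on the running interpreter, so it can be used to check whether a given build is affected.

```ruby
str = "a\u0300i\u0301"

# Four code points: two base letters plus two combining marks.
char_size = str.each_char.size

# Counting by actually iterating gives the number of grapheme clusters.
cluster_count = str.each_grapheme_cluster.count

puts "each_char.size:              #{char_size}"
puts "each_grapheme_cluster.count: #{cluster_count}"
puts "each_grapheme_cluster.size:  #{str.each_grapheme_cluster.size.inspect}"
```

On an affected build, the last value matches `each_char.size` rather than the cluster count.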

The source code reveals that it invokes `rb_str_each_char_size`, so it is equivalent to `each_char.size`:

```c
static VALUE
rb_str_each_grapheme_cluster(VALUE str)
{
    RETURN_SIZED_ENUMERATOR(str, 0, 0, rb_str_each_char_size);
    return rb_str_enumerate_grapheme_clusters(str, 0);
}
```

If the grapheme enumerator's size cannot be calculated lazily, `each_grapheme_cluster.size` should return `nil` to indicate that.
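The two possible behaviours can be sketched in plain Ruby. This is only an illustration of the expected semantics, not the proposed C patch: `unsized` and `sized` are hypothetical names, and the wrappers are built with `Enumerator.new` and `enum_for`, which are existing core methods.

```ruby
str = "a\u0300i\u0301"

# Variant 1: an enumerator created without a size argument.
# Enumerator#size returns nil here, signalling "cannot be computed lazily".
unsized = Enumerator.new do |yielder|
  str.each_grapheme_cluster { |cluster| yielder << cluster }
end

# Variant 2: an enumerator whose size block counts the clusters on demand.
# The block runs only when #size is called.
sized = str.enum_for(:each_grapheme_cluster) { str.each_grapheme_cluster.count }

p unsized.size   # nil
p unsized.count  # walks the string and counts the clusters
p sized.size     # computed by the size block
```

Until the enumerator's `size` is reliable, calling `count` (or `to_a.size`) on `each_grapheme_cluster` is a workaround, at the cost of iterating over the whole string.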


