Issue #12744 has been updated by Bouke van der Bijl.


Shyouhei Urabe wrote:
> I doubt we can make a reverse_each_char that is faster than reverse.each_char.  It is not always clear where the boundary between one character and the next lies, especially when scanning backwards.  We might end up scanning the whole string from the beginning, splitting the characters into separate substrings, and then iterating over them.

I'm not sure why you think we can't make it faster than `reverse.each_char`: I've already implemented it and attached the patch. It uses `rb_enc_left_char_head`, which is implemented by all the encodings to scan a string backwards.

For the most common encoding (UTF-8), it is always possible to scan a string backwards from any point. Looking at the other encodings implemented in Ruby, it seems only GB18030 has a stateful way to back up to the previous character, so iterating backwards over a string in that encoding could end up being O(N^2).
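
To illustrate why backward scanning is cheap in UTF-8, here is a minimal pure-Ruby sketch. It is not the attached patch, which is written in C on top of `rb_enc_left_char_head`; the helper name `utf8_reverse_each_char` is hypothetical, and the sketch assumes the input is valid UTF-8. It relies only on the fact that UTF-8 continuation bytes always have the bit pattern `10xxxxxx`, so stepping back to the nearest byte that is not a continuation byte lands on the start of a character.

```ruby
# Hypothetical helper for illustration only (not part of the patch).
# Yields each character of a valid UTF-8 string, last character first.
def utf8_reverse_each_char(str)
  unless str.encoding == Encoding::UTF_8 && str.valid_encoding?
    raise ArgumentError, "expected a valid UTF-8 string"
  end

  stop = str.bytesize
  while stop > 0
    start = stop - 1
    # Step back over continuation bytes (10xxxxxx) to reach the lead byte.
    start -= 1 while start > 0 && (str.getbyte(start) & 0xC0) == 0x80
    yield str.byteslice(start, stop - start)
    stop = start
  end
  str
end

reversed = []
utf8_reverse_each_char("aé日😀") { |c| reversed << c }
```

Each character needs at most three steps back, so the whole walk is linear in the byte length. Unlike the C patch, this sketch still allocates a small string per character through `byteslice`.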

----------------------------------------
Feature #12744: Add str.reverse_each_char and str.reverse_chars
https://bugs.ruby-lang.org/issues/12744#change-60495

* Author: Bouke van der Bijl
* Status: Feedback
* Priority: Normal
* Assignee: 
----------------------------------------
This patch adds `str.reverse_each_char` and `str.reverse_chars`. It is currently not really possible to iterate over a Ruby string in reverse while guaranteeing that you are not accidentally introducing an O(N^2) bug, short of transcoding to a fixed-width encoding like UTF-32. This is because variable-length encodings like UTF-8 require scanning the string from the start whenever you want to address a character by index.
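
As a rough sketch of the two workarounds available without the patch (the sample string and variable names are only illustrative): reversing first costs a full copy of the string, while indexing by character position may rescan the string on every lookup when it contains multibyte characters.

```ruby
str = "héllo wörld"

# Workaround 1: allocate a reversed copy, then iterate it.
# Linear time, but needs extra memory proportional to the string.
via_reverse = []
str.reverse.each_char { |c| via_reverse << c }

# Workaround 2: index by character position from the end.
# No reversed copy, but each str[i] may have to scan from the start
# of a variable-length-encoded string, which risks O(N^2) overall.
via_index = []
(str.length - 1).downto(0) { |i| via_index << str[i] }
```

Both produce the same sequence of characters; they differ only in what they cost.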

The patch uses `rb_enc_left_char_head` to iterate over the string in reverse, so you can do so without allocating more memory.

---Files--------------------------------
add-reverse-string-iteration.patch (5.91 KB)


-- 
https://bugs.ruby-lang.org/
