--mimepart_4c99a38235188_eedcdd796a12342
Content-Type: text/plain
Content-Transfer-Encoding: Quoted-printable
Content-Disposition: inline

Bug #3855: String#rindex extremely slow on long UTF8 strings
http://redmine.ruby-lang.org/issues/show/3855

Author: Michael Selig
Status: Open, Priority: Normal
ruby -v: ruby 1.9.3dev (2010-09-21 trunk 29308) [i686-linux]

Not really a bug ..... I think this issue was raised a few months ago, but I have done a very simple patch to string.c to fix the problem.

Example:
ruby -e 'p String.new("XXX\u0639" + "X" * 100000).rindex("\u0639")'
takes approx 2.7 secs on my old AMD Athlon system, but only approx 0.02 sec with the patch below.

The problem is worst when the search string is either not found or is near the beginning of the string. The issue is the call to "str_nth()" which has to scan the string repeatedly on multibyte encodings just to locate where to start comparing.

I hope that you will consider applying the patch.

Mike

----------------------------------------
http://redmine.ruby-lang.org

--mimepart_4c99a38235188_eedcdd796a12342
Content-Type: application/octet-stream; name=rindex.pat
Content-Transfer-Encoding: Base64
Content-Disposition: attachment; filename=rindex.pat

SW5kZXg6IHN0cmluZy5jCj09PT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT0KLS0tIHN0
cmluZy5jCShyZXZpc2lvbiAyOTMxNSkKKysrIHN0cmluZy5jCSh3b3JraW5n
IGNvcHkpCkBAIC0yNDg4LDE0ICsyNDg4LDE0IEBACiAgICAgZSA9IFJTVFJJ
TkdfRU5EKHN0cik7CiAgICAgdCA9IFJTVFJJTkdfUFRSKHN1Yik7CiAgICAg
c2xlbiA9IFJTVFJJTkdfTEVOKHN1Yik7Ci0gICAgZm9yICg7OykgewotCXMg
PSBzdHJfbnRoKHNiZWcsIGUsIHBvcywgZW5jLCBzaW5nbGVieXRlKTsKLQlp
ZiAoIXMpIHJldHVybiAtMTsKKyAgICBzID0gc3RyX250aChzYmVnLCBlLCBw
b3MsIGVuYywgc2luZ2xlYnl0ZSk7CisgICAgd2hpbGUgKHMpIHsKIAlpZiAo
bWVtY21wKHMsIHQsIHNsZW4pID09IDApIHsKIAkgICAgcmV0dXJuIHBvczsK
IAl9CiAJaWYgKHBvcyA9PSAwKSBicmVhazsKIAlwb3MtLTsKKwlzID0gcmJf
ZW5jX3ByZXZfY2hhcihzYmVnLCBzLCBlLCBlbmMpOwogICAgIH0KICAgICBy
ZXR1cm4gLTE7CiB9Cg
--mimepart_4c99a38235188_eedcdd796a12342--
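
The slowdown described in the report can be sketched in plain Ruby. This is only a rough model working on raw UTF-8 bytes, not the actual string.c code: the helper names (byte_offset_of_char, prev_char_offset, rindex_rescan, rindex_step_back) are made up for illustration, and the sketch assumes valid UTF-8 input. The first search recomputes the byte offset from the start of the string on every step, roughly as the report says str_nth() is used; the second locates the starting offset once and then steps back one character at a time, which is how the attached patch appears to use rb_enc_prev_char().

```ruby
# Illustrative model only; none of these helpers exist in Ruby's string.c.

# Byte offset of the character at index `pos`, found by walking forward
# from the start of the string (a rough stand-in for str_nth()).
def byte_offset_of_char(bytes, pos)
  off = 0
  pos.times do
    return nil if off >= bytes.size
    off += 1
    # Skip UTF-8 continuation bytes (bit pattern 10xxxxxx).
    off += 1 while off < bytes.size && (bytes[off] & 0xC0) == 0x80
  end
  off
end

# Byte offset of the character just before `off`
# (a rough stand-in for rb_enc_prev_char()).
def prev_char_offset(bytes, off)
  return nil if off <= 0
  off -= 1
  off -= 1 while off > 0 && (bytes[off] & 0xC0) == 0x80
  off
end

# Rescanning search: every candidate position costs a walk from the start,
# so a miss on an n-character string does on the order of n*n work.
def rindex_rescan(str, sub)
  bytes = str.bytes
  sub_bytes = sub.bytes
  pos = str.length - sub.length
  return nil if pos < 0
  loop do
    off = byte_offset_of_char(bytes, pos)
    return nil if off.nil?
    return pos if bytes[off, sub_bytes.size] == sub_bytes
    return nil if pos == 0
    pos -= 1
  end
end

# Step-back search: one forward walk, then constant work per candidate.
def rindex_step_back(str, sub)
  bytes = str.bytes
  sub_bytes = sub.bytes
  pos = str.length - sub.length
  return nil if pos < 0
  off = byte_offset_of_char(bytes, pos)
  while off
    return pos if bytes[off, sub_bytes.size] == sub_bytes
    break if pos == 0
    pos -= 1
    off = prev_char_offset(bytes, off)
  end
  nil
end

s = "XXX\u0639" + "X" * 2000
p rindex_rescan(s, "\u0639") == s.rindex("\u0639")
p rindex_step_back(s, "\u0639") == s.rindex("\u0639")
```

Both functions should agree with String#rindex on the example from the report; the difference is only in how much of the string gets rescanned when the match is far from the end or missing.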