--mimepart_4c99a38235188_eedcdd796a12342
Content-Type: text/plain
Content-Transfer-Encoding: Quoted-printable
Content-Disposition: inline

Bug #3855: String#rindex extremely slow on long UTF8 strings
http://redmine.ruby-lang.org/issues/show/3855

Author: Michael Selig
Status: Open, Priority: Normal
ruby -v: ruby 1.9.3dev (2010-09-21 trunk 29308) [i686-linux]

This is a performance problem rather than a bug.
I think this issue was raised a few months ago; I have written a very simple patch to string.c that fixes it.

Example:

ruby -e 'p String.new("XXX\u0639" + "X" * 100000).rindex("\u0639")'

This takes approximately 2.7 seconds on my old AMD Athlon system, but only approximately 0.02 seconds with the attached patch. The problem is worst when the search string is either not found or occurs near the beginning of the string.
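
To see how the time grows with string length, the one-liner above can be wrapped in a small timing script. This is only a sketch: the helper names build_haystack and time_rindex are made up for illustration, and the absolute numbers will depend on the machine and the Ruby build.

```ruby
require 'benchmark'

# Build the same kind of input as the one-liner: one multibyte
# character near the start, followed by a long ASCII tail.
# (build_haystack is a hypothetical helper, not part of Ruby.)
def build_haystack(tail_length)
  "XXX\u0639" + "X" * tail_length
end

# Time a single rindex call and return [result, elapsed_seconds].
# (time_rindex is a hypothetical helper, not part of Ruby.)
def time_rindex(haystack, needle)
  result = nil
  seconds = Benchmark.realtime { result = haystack.rindex(needle) }
  [result, seconds]
end

# Grow the tail by a factor of ten each time and print the timing.
[1_000, 10_000, 100_000].each do |n|
  index, seconds = time_rindex(build_haystack(n), "\u0639")
  printf("tail=%-7d index=%d time=%.4fs\n", n, index, seconds)
end
```

If the elapsed time grows roughly a hundredfold for each tenfold increase in length, the search is behaving quadratically; roughly tenfold growth suggests it is linear.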

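The cause is the call to str_nth() inside the search loop: on multibyte encodings it has to rescan the string from the beginning on every iteration just to locate where to start comparing. The patch calls str_nth() once, before the loop, and then steps back one character at a time with rb_enc_prev_char().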
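
The difference between the two strategies can be modelled in plain Ruby over the raw bytes. This is a rough sketch, not the code in string.c: it assumes valid UTF-8 input, and all four method names are invented for illustration. offset_by_forward_scan stands in for the role of str_nth(), and previous_char_offset for the role of rb_enc_prev_char().

```ruby
# Byte offset of the character at index pos, found by walking forward
# from the start of the byte array. Cost grows with pos.
def offset_by_forward_scan(bytes, pos)
  offset = 0
  pos.times do
    offset += 1
    # Skip UTF-8 continuation bytes (bit pattern 10xxxxxx).
    offset += 1 while offset < bytes.length && (bytes[offset] & 0xC0) == 0x80
  end
  offset
end

# Byte offset of the character that ends just before offset, found by
# stepping back over continuation bytes. Cost is at most a few bytes.
def previous_char_offset(bytes, offset)
  return nil if offset.zero?
  offset -= 1
  offset -= 1 while offset > 0 && (bytes[offset] & 0xC0) == 0x80
  offset
end

# Model of the loop before the patch: the byte offset is recomputed
# from the start of the string on every iteration.
def rindex_rescan(haystack, needle)
  hay = haystack.bytes
  ndl = needle.bytes
  pos = haystack.length - needle.length
  while pos >= 0
    offset = offset_by_forward_scan(hay, pos)
    return pos if hay[offset, ndl.length] == ndl
    pos -= 1
  end
  nil
end

# Model of the loop after the patch: one forward scan to find the
# starting offset, then one backward step per iteration.
def rindex_step_back(haystack, needle)
  hay = haystack.bytes
  ndl = needle.bytes
  pos = haystack.length - needle.length
  return nil if pos < 0
  offset = offset_by_forward_scan(hay, pos)
  while offset
    return pos if hay[offset, ndl.length] == ndl
    break if pos.zero?
    pos -= 1
    offset = previous_char_offset(hay, offset)
  end
  nil
end

sample = "XXX\u0639" + "X" * 2_000
p rindex_rescan(sample, "\u0639")
p rindex_step_back(sample, "\u0639")
```

Both methods return the same character index, but the first does work proportional to the square of the string length when the match is near the beginning or absent, while the second stays linear.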

I hope that you will consider applying the patch.

Mike


----------------------------------------
http://redmine.ruby-lang.org

--mimepart_4c99a38235188_eedcdd796a12342
Content-Type: application/octet-stream; name=rindex.pat
Content-Transfer-Encoding: Base64
Content-Disposition: attachment; filename=rindex.pat

SW5kZXg6IHN0cmluZy5jCj09PT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT0KLS0tIHN0
cmluZy5jCShyZXZpc2lvbiAyOTMxNSkKKysrIHN0cmluZy5jCSh3b3JraW5n
IGNvcHkpCkBAIC0yNDg4LDE0ICsyNDg4LDE0IEBACiAgICAgZSA9IFJTVFJJ
TkdfRU5EKHN0cik7CiAgICAgdCA9IFJTVFJJTkdfUFRSKHN1Yik7CiAgICAg
c2xlbiA9IFJTVFJJTkdfTEVOKHN1Yik7Ci0gICAgZm9yICg7OykgewotCXMg
PSBzdHJfbnRoKHNiZWcsIGUsIHBvcywgZW5jLCBzaW5nbGVieXRlKTsKLQlp
ZiAoIXMpIHJldHVybiAtMTsKKyAgICBzID0gc3RyX250aChzYmVnLCBlLCBw
b3MsIGVuYywgc2luZ2xlYnl0ZSk7CisgICAgd2hpbGUgKHMpIHsKIAlpZiAo
bWVtY21wKHMsIHQsIHNsZW4pID09IDApIHsKIAkgICAgcmV0dXJuIHBvczsK
IAl9CiAJaWYgKHBvcyA9PSAwKSBicmVhazsKIAlwb3MtLTsKKwlzID0gcmJf
ZW5jX3ByZXZfY2hhcihzYmVnLCBzLCBlLCBlbmMpOwogICAgIH0KICAgICBy
ZXR1cm4gLTE7CiB9Cg
--mimepart_4c99a38235188_eedcdd796a12342--