Issue #15800 has been updated by ko1 (Koichi Sasada).


https://github.com/ruby/ruby/commit/a47f598d77ac97f9fe89fe16aa8bcab4fd262c16

----------------------------------------
Misc #15800: Reduce ONIG_NREGION from 10 to 4: power of 2 and testing revealed most pattern matches are less than or equal to 4 results
https://bugs.ruby-lang.org/issues/15800#change-80155

* Author: methodmissing (Lourens Naudé)
* Status: Closed
* Priority: Normal
* Assignee:

----------------------------------------
References PR https://github.com/ruby/ruby/pull/2135 - it's a very small change, but I'm running due diligence past the list too for discussion.

I noticed `onig_region_resize` (called from `onig_region_copy`) defaults to allocating a `10 * 8` byte block on 64-bit for each of the `beg` and `end` members of `OnigRegion`.
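
The sizing arithmetic can be sketched in a few lines of Ruby. This is only a back-of-the-envelope illustration, assuming 8-byte offset slots as on the 64-bit build described above; `region_array_bytes` is a hypothetical helper, not something from the Ruby or Onigmo sources.

```ruby
# Back-of-the-envelope sizing for the default OnigRegion arrays.
# ASSUMPTION: each offset slot is 8 bytes (64-bit build).
SLOT_BYTES = 8

# Bytes allocated for ONE of the two arrays (`beg` or `end`).
# Hypothetical helper for illustration only.
def region_array_bytes(nregion, slot_bytes = SLOT_BYTES)
  nregion * slot_bytes
end

old_bytes = region_array_bytes(10) # current ONIG_NREGION
new_bytes = region_array_bytes(4)  # proposed ONIG_NREGION

puts "per array:  #{old_bytes} -> #{new_bytes} bytes"
puts "per region: #{old_bytes * 2} -> #{new_bytes * 2} bytes (beg + end)"
```

Under that assumption each array shrinks from 80 to 32 bytes, so a region's two arrays together go from 160 to 64 bytes.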

Preliminary testing with Rails and the benchmark suite suggests that most pattern matches are `<=` 4 results.

#### Due diligence with debug counters

A few requests on a blank Redmine instance:

```
[RUBY_DEBUG_COUNTER]	obj_match_under4              	         10650 <<<<<<<<<<
[RUBY_DEBUG_COUNTER]	obj_match_ge4                 	          1589 <<<<<<<<<<
[RUBY_DEBUG_COUNTER]	obj_match_ge8                 	            66
[RUBY_DEBUG_COUNTER]	obj_match_ptr                 	         12305
```

Single match: `1000000.times { 'haystack'.match(/hay/) }`

```
[RUBY_DEBUG_COUNTER]	obj_match_under4              	        999366 <<<<<<<<<<
[RUBY_DEBUG_COUNTER]	obj_match_ge4                 	           473 <<<<<<<<<<
[RUBY_DEBUG_COUNTER]	obj_match_ge8                 	             0
[RUBY_DEBUG_COUNTER]	obj_match_ptr                 	        999839
```

Multiple matches (`> 4`): `1000000.times { /(.)(.)(\d+)(\d)/.match("THX1138.") }`

```
[RUBY_DEBUG_COUNTER]	obj_match_under4              	           353 <<<<<<<<<<
[RUBY_DEBUG_COUNTER]	obj_match_ge4                 	        997579 <<<<<<<<<<
[RUBY_DEBUG_COUNTER]	obj_match_ge8                 	             0
[RUBY_DEBUG_COUNTER]	obj_match_ptr                 	        997932
```
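
To see why the two micro-benchmarks land in different buckets, here is a rough Ruby sketch. It assumes the counters are keyed on the number of registers (the whole match plus its capture groups), which is what `MatchData#size` reports; `match_bucket` is a hypothetical helper that only mirrors the counter names and is not the C code from the patch.

```ruby
# Hypothetical bucketing helper that mirrors the counter names above.
# ASSUMPTION: buckets are keyed on the register count
# (whole match plus capture groups), as reported by MatchData#size.
def match_bucket(match)
  n = match.size
  if n >= 8
    :ge8
  elsif n >= 4
    :ge4
  else
    :under4
  end
end

single = 'haystack'.match(/hay/)
multi  = /(.)(.)(\d+)(\d)/.match("THX1138.")

puts "#{single.size} register(s) -> #{match_bucket(single)}"
puts "#{multi.size} register(s) -> #{match_bucket(multi)}"
```

A pattern without groups needs one register, while the four-group pattern needs five, which is consistent with the `obj_match_under4` and `obj_match_ge4` totals shown for each run.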

#### Memory and ips benchmarks, MatchData specific

```
lourens@CarbonX1:~/src/ruby/ruby$ /usr/local/bin/ruby --disable=gems -rrubygems -I./benchmark/lib ./benchmark/benchmark-driver/exe/benchmark-driver --executables="compare-ruby::~/src/ruby/trunk/ruby --disable=gems -I.ext/common --disable-gem" --executables="built-ruby::./miniruby -I./lib -I. -I.ext/common  -r./prelude --disable-gem" -v --repeat-count=24 -r memory $(ls ./benchmark/*match*.{yml,rb} 2>/dev/null)
compare-ruby: ruby 2.7.0dev (2019-04-19 trunk 67619) [x86_64-linux]
built-ruby: ruby 2.7.0dev (2019-04-19 reduce-onig-de.. 67619) [x86_64-linux]
last_commit=Reduce ONIG_NREGION from 10 to 4: power of 2 and testing revealed most pattern matches are less than or equal to 4 results
Calculating -------------------------------------
                     compare-ruby  built-ruby
           match_gt4      11.936M     11.600M bytes -       1.000 times
         match_small      11.848M     11.608M bytes -       1.000 times

Comparison:
                        match_gt4
          built-ruby:  11600000.0 bytes
        compare-ruby:  11936000.0 bytes - 1.03x  larger

                      match_small
          built-ruby:  11608000.0 bytes
        compare-ruby:  11848000.0 bytes - 1.02x  larger

lourens@CarbonX1:~/src/ruby/ruby$ /usr/local/bin/ruby --disable=gems -rrubygems -I./benchmark/lib ./benchmark/benchmark-driver/exe/benchmark-driver --executables="compare-ruby::~/src/ruby/trunk/ruby --disable=gems -I.ext/common --disable-gem" --executables="built-ruby::./miniruby -I./lib -I. -I.ext/common  -r./prelude --disable-gem" -v --repeat-count=24 -r ips $(ls ./benchmark/*match*.{yml,rb} 2>/dev/null)
compare-ruby: ruby 2.7.0dev (2019-04-19 trunk 67619) [x86_64-linux]
built-ruby: ruby 2.7.0dev (2019-04-19 reduce-onig-de.. 67619) [x86_64-linux]
last_commit=Reduce ONIG_NREGION from 10 to 4: power of 2 and testing revealed most pattern matches are less than or equal to 4 results
Calculating -------------------------------------
                     compare-ruby  built-ruby
           match_gt4        1.664       1.754 i/s -       1.000 times in 0.600793s 0.570031s
         match_small        1.856       2.047 i/s -       1.000 times in 0.538838s 0.488407s

Comparison:
                        match_gt4
          built-ruby:         1.8 i/s
        compare-ruby:         1.7 i/s - 1.05x  slower

                      match_small
          built-ruby:         2.0 i/s
        compare-ruby:         1.9 i/s - 1.10x  slower
```

I am fine with removing the debug counters; I committed them for now because they make it easier for reviewers to reproduce these numbers locally.

For additional context, I noticed that character offsets are bounded by the `num_regs` member as per https://github.com/ruby/ruby/blob/trunk/re.c#L989-L1005 and therefore investigated bringing `allocated` and `num_regs` closer together for the common cases.
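
From the Ruby side, that bound is visible through `MatchData#offset`. The following is a small sketch of the observable behaviour only, not of the linked `re.c` code: valid register indexes run from 0 to `size - 1`, and anything beyond that raises rather than reading from the unused part of the allocation.

```ruby
m = /(.)(.)(\d+)(\d)/.match("THX1138.")

# Valid register indexes run from 0 (whole match) to m.size - 1.
m.size.times do |i|
  first, last = m.offset(i)
  puts "reg #{i}: #{first}...#{last} => #{m[i].inspect}"
end

# Asking for a register past the last one raises IndexError.
begin
  m.offset(m.size)
rescue IndexError => e
  puts "out of range: #{e.message}"
end
```

Only five registers are reachable here, regardless of how many slots were allocated behind them.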

And some more of the 80 byte allocs from strscan with only the first chunk referenced:

```
==24182== -------------------- 283 of 1000 --------------------
==24182== max-live:    19,520 in 244 blocks
==24182== tot-alloc:   30,480 in 381 blocks (avg size 80.00)
==24182== deaths:      381, at avg age 423,950,747 (3.96% of prog lifetime)
==24182== acc-ratios:  1.95 rd, 4.98 wr  (59,728 b-read, 151,920 b-written)
==24182==    at 0x4C2DECF: malloc (in /usr/lib/valgrind/vgpreload_exp-dhat-amd64-linux.so)
==24182==    by 0x2561E6: onig_region_resize (regexec.c:260)
==24182==    by 0x2561E6: onig_region_resize_clear (regexec.c:298)
==24182==    by 0x2561E6: onig_match (regexec.c:3882)
==24182==    by 0xA4C376B: strscan_do_scan (strscan.c:472)
==24182==    by 0xA4C376B: strscan_skip (strscan.c:570)
==24182==    by 0x2E5B4E: vm_call_cfunc_with_frame (vm_insnhelper.c:2207)
==24182==    by 0x2E5B4E: vm_call_cfunc (vm_insnhelper.c:2225)
==24182==
==24182== Aggregated access counts by offset:
==24182==
==24182== [   0]  26456 26456 26456 26456 26456 26456 26456 26456 0 0 0 0 0 0 0 0
==24182== [  16]  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 <<<<<<<<<<
==24182== [  32]  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 <<<<<<<<<<
==24182== [  48]  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 <<<<<<<<<<
==24182== [  64]  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 <<<<<<<<<<
```
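
For reference, a call of the kind that reaches `strscan_skip` in the trace above could look like the sketch below. The input string and pattern are made up for illustration. The byte figures assume 8-byte offset slots and the default of 10 regions, which matches the 80-byte blocks in the report, where only offsets 0 to 7 show any accesses.

```ruby
require 'strscan'

# Illustrative input; the real workload behind the report is not shown.
scanner = StringScanner.new("test string")

# A pattern with no capture groups only needs register 0.
skipped = scanner.skip(/\w+/)
puts "skipped #{skipped} bytes, now at position #{scanner.pos}"

# ASSUMPTION: 8-byte offset slots, as in the 64-bit report above.
used_bytes      = 1 * 8   # register 0 only
allocated_bytes = 10 * 8  # default ONIG_NREGION of 10
puts "used #{used_bytes} of #{allocated_bytes} bytes per array"
```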


