Issue #15806 has been reported by methodmissing (Lourens Naudé).

----------------------------------------
Misc #15806: Explicitly initialise encodings on init to remove branches on encoding lookup
https://bugs.ruby-lang.org/issues/15806

* Author: methodmissing (Lourens Naudé)
* Status: Open
* Priority: Normal
* Assignee:
----------------------------------------
References GitHub PR https://github.com/ruby/ruby/pull/2128

I noticed that the encoding table is loaded on startup of even just `miniruby` (minimal viable interpreter use case), as shown by this backtrace during ruby setup:

```
/home/lourens/src/ruby/ruby/miniruby(rb_enc_init+0x12) [0x56197b0c0c72] encoding.c:587
/home/lourens/src/ruby/ruby/miniruby(rb_usascii_encoding+0x1a) [0x56197b0c948a] encoding.c:1357
/home/lourens/src/ruby/ruby/miniruby(Init_sym+0x7a) [0x56197b24810a] symbol.c:42
/home/lourens/src/ruby/ruby/miniruby(rb_call_inits+0x1d) [0x56197b11afed] inits.c:25
/home/lourens/src/ruby/ruby/miniruby(ruby_setup+0xf6) [0x56197b0ec9d6] eval.c:74
/home/lourens/src/ruby/ruby/miniruby(ruby_init+0x9) [0x56197b0eca39] eval.c:91
/home/lourens/src/ruby/ruby/miniruby(main+0x5a) [0x56197b051a2a] ./main.c:41
```

Therefore I think it makes sense to instead initialize encodings explicitly just prior to symbol init, which is the first entry point into the interpreter loading that currently triggers `rb_enc_init`, and to remove the initialization check branches from the various lookup methods.

Below are some of the branches that collapse, shown in `cachegrind` output. The columns are `Ir Bc Bcm Bi Bim`; `Ir` (instructions executed), `Bc` (conditional branches executed) and `Bcm` (conditional branches mispredicted) are the relevant ones here, as there are no indirect branches (function pointers etc.):

(hot function, many instructions executed and many branches executed and mispredicted)

```
         .          .       .  .  .  rb_encoding *
         .          .       .  .  .  rb_enc_from_index(int index)
   835,669          0       0  0  0  {
13,133,536  6,337,652  50,267  0  0      if (!enc_table.list) {
         3          0       0  0  0          rb_enc_init();
         .          .       .  .  .      }
23,499,349  8,006,202 293,161  0  0      if (index < 0 || enc_table.count <= (index &= ENC_INDEX_MASK)) {
         .          .       .  .  .          return 0;
         .          .       .  .  .      }
30,024,494          0       0  0  0      return enc_table.list[index].enc;
 1,671,338          0       0  0  0  }
```

(cold function, more or less representative of the utf8 variant too)

```
     .      .    .  .  .  rb_encoding *
     .      .    .  .  .  rb_ascii8bit_encoding(void)
     .      .    .  .  .  {
27,702  9,235  955  0  0      if (!enc_table.list) {
     .      .    .  .  .          rb_enc_init();
     .      .    .  .  .      }
 9,238      0    0  0  0      return enc_table.list[ENCINDEX_ASCII].enc;
 9,232      0    0  0  0  }
```

I think lazy loading encodings and populating the table is fine, but initializing it can be done more explicitly in the boot process.
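
To illustrate the shape of the proposed change, here is a minimal self-contained sketch in C. It is not the code from the PR: every `toy_*` name is a hypothetical stand-in for the real structures in `encoding.c`, and the table is reduced to three fixed entries. It only shows how an explicit init call at boot lets a lookup drop its "is the table initialised yet?" branch.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins (assumptions for illustration); these are NOT the
 * real definitions from encoding.c. */
typedef struct { const char *name; } toy_encoding;

enum { TOY_ASCII8BIT, TOY_UTF8, TOY_USASCII, TOY_COUNT };

static toy_encoding toy_builtin[TOY_COUNT] = {
    { "ASCII-8BIT" }, { "UTF-8" }, { "US-ASCII" }
};

static struct {
    toy_encoding *list;   /* NULL until toy_enc_init() has run */
    int count;
} toy_table;

/* One-time setup, playing the role that rb_enc_init() has in the report. */
static void toy_enc_init(void)
{
    toy_table.list = toy_builtin;
    toy_table.count = TOY_COUNT;
}

/* Before: every lookup pays for the initialisation check. */
static toy_encoding *toy_from_index_lazy(int index)
{
    if (!toy_table.list) {
        toy_enc_init();
    }
    if (index < 0 || toy_table.count <= index) {
        return NULL;
    }
    return &toy_table.list[index];
}

/* After: the boot sequence guarantees toy_enc_init() already ran,
 * so only the bounds check remains on the lookup path. */
static toy_encoding *toy_from_index(int index)
{
    assert(toy_table.list != NULL); /* debug-only guard; gone with NDEBUG */
    if (index < 0 || toy_table.count <= index) {
        return NULL;
    }
    return &toy_table.list[index];
}

/* Boot sketch: initialise encodings explicitly before the first
 * consumer runs (symbol init, per the backtrace in the report). */
static void toy_boot(void)
{
    toy_enc_init();
    /* ... symbol init and the rest of interpreter setup would follow ... */
}
```

Calling `toy_boot()` once before any lookup makes `toy_from_index` safe to use; the lazy variant is kept only for comparison. The `assert` is a design choice of this sketch rather than something the report proposes: it catches a missed init in debug builds without adding a branch to release builds.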