Issue #16160 has been updated by methodmissing (Lourens Naud=E9).


nobu (Nobuyoshi Nakada) wrote:
> I'm positive about this, except for the performance.
> Do you have any numbers?

Apologies for the delay in replying.

Using `benchmark-driver` script (running set last as it would taint the oth=
ers by initializing the locals table on the thread object initialized in pr=
elude):

```
prelude: |
  th =3D Thread.new {}
benchmark:
  thread_variable_get: th.thread_variable_get('foo')
  thread_variables: th.thread_variables
  thread_variable_p: th.thread_variable?('foo')
  thread_variable_set: th.thread_variable_set('foo', 'bar')
loop_count: 1000000
```

```
lourens@CarbonX1:~/src/ruby/ruby$ /usr/local/bin/ruby --disable=3Dgems -rru=
bygems -I./benchmark/lib ./benchmark/benchmark-driver/exe/benchmark-driver =
            --executables=3D"compare-ruby::~/src/ruby/trunk/ruby --disable=
=3Dgems -I.ext/common --disable-gem"             --executables=3D"built-rub=
y::./miniruby -I./lib -I. -I.ext/common  -r./prelude --disable-gem" -v --re=
peat-count=3D10 $HOME/src/lazy_init_thread_locals.yml
compare-ruby: ruby 2.7.0dev (2019-09-22T01:11:51Z master a0ce0b6297) [x86_6=
4-linux]
built-ruby: ruby 2.7.0dev (2019-09-22T01:21:06Z lazy-init-thread-l.. 24463b=
7252) [x86_64-linux]
Calculating -------------------------------------
                     compare-ruby  built-ruby =

 thread_variable_get      11.305M     33.901M i/s -      1.000M times in 0.=
088456s 0.029498s
    thread_variables      22.765M     40.344M i/s -      1.000M times in 0.=
043927s 0.024787s
   thread_variable_p      19.260M     20.883M i/s -      1.000M times in 0.=
051921s 0.047886s
 thread_variable_set       8.195M      8.543M i/s -      1.000M times in 0.=
122030s 0.117054s

Comparison:
              thread_variable_get
          built-ruby:  33900806.8 i/s =

        compare-ruby:  11305022.7 i/s - 3.00x  slower

                 thread_variables
          built-ruby:  40344251.1 i/s =

        compare-ruby:  22765106.8 i/s - 1.77x  slower

                thread_variable_p
          built-ruby:  20882884.5 i/s =

        compare-ruby:  19260142.8 i/s - 1.08x  slower

              thread_variable_set
          built-ruby:   8543090.8 i/s =

        compare-ruby:   8194725.4 i/s - 1.04x  slower
```

A regression on `thread_variable_set`, but improvement on others.

And with memory runner (although knowing ahead of time it's just the hash, =
40 bytes with array table saved):

```
lourens@CarbonX1:~/src/ruby/ruby$ /usr/local/bin/ruby --disable=3Dgems -rru=
bygems -I./benchmark/lib ./benchmark/benchmark-driver/exe/benchmark-driver =
            --executables=3D"compare-ruby::~/src/ruby/trunk/ruby --disable=
=3Dgems -I.ext/common --disable-gem"             --executables=3D"built-rub=
y::./miniruby -I./lib -I. -I.ext/common  -r./prelude --disable-gem" -v --re=
peat-count=3D10 -r memory $HOME/src/lazy_init_thread_locals.yml
compare-ruby: ruby 2.7.0dev (2019-09-22T01:11:51Z master a0ce0b6297) [x86_6=
4-linux]
built-ruby: ruby 2.7.0dev (2019-09-22T01:21:06Z lazy-init-thread-l.. 24463b=
7252) [x86_64-linux]
Calculating -------------------------------------
                     compare-ruby  built-ruby =

 thread_variable_get      11.632M     11.528M bytes -      1.000M times
    thread_variables      11.668M     11.472M bytes -      1.000M times
   thread_variable_p      11.692M     11.452M bytes -      1.000M times
 thread_variable_set      11.652M     11.568M bytes -      1.000M times

Comparison:
              thread_variable_get
          built-ruby:  11528000.0 bytes =

        compare-ruby:  11632000.0 bytes - 1.01x  larger

                 thread_variables
          built-ruby:  11472000.0 bytes =

        compare-ruby:  11668000.0 bytes - 1.02x  larger

                thread_variable_p
          built-ruby:  11452000.0 bytes =

        compare-ruby:  11692000.0 bytes - 1.02x  larger

              thread_variable_set
          built-ruby:  11568000.0 bytes =

        compare-ruby:  11652000.0 bytes - 1.01x  larger
```

----------------------------------------
Misc #16160: Lazy init thread local storage
https://bugs.ruby-lang.org/issues/16160#change-81658

* Author: methodmissing (Lourens Naud=E9)
* Status: Open
* Priority: Normal
* Assignee: =

----------------------------------------
References PR https://github.com/ruby/ruby/pull/2295

### Why?

The `local_storage` member of execution context is lazy initialized and dri=
ves the `Thread#[]` and `Thread#[]=3D` APIs, which are Fiber local and not =
Thread local storage. I think the same lazy init pattern should be applied =
to the APIs below as well - reduces one `Hash` alloc per thread created tha=
t does not use thread locals.

### Lazy allocates thread local storage for the following APIs

*  `Thread#thread_variable_get` - early returns `nil` on locals Hash not in=
itialised
*  `Thread#thread_variable_set` - forces allocation of the locals Hash if n=
ot initilalised
*  `Thread#thread_variables` - early returns the empty array AND saves on H=
ash iteration if locals Hash not initialised
*  `Thread#thread_variable?` - early returns `false` on locals Hash not ini=
tialised

### Other notes

* Moved initial implementation from `internal.h` to `thread.c` local to cal=
l sites.
* Preferred `defs/id.def` for the `locals` ID (seeing this pattern used mor=
e often, but not sure if that is preferred to inline `rb_intern` yet. Eithe=
r way there's quite a few different conventions around IDs in the codebase =
at the moment and happy to help converging to a standard instead.
* Maybe a flag is overkill and `NIL_P` on `locals` ivar could also work ...

Thoughts?



-- =

https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request / ruby-lang.org?subject=3Dunsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>