On Mon, 2006-11-06 at 23:21 +0900, khaines / enigo.com wrote:
> So, what would you suggest as a performance optimization to Mutex?

Principally, pushing the whole thing into C and using a better data
structure than a dynamically allocated C array of VALUEs (what underlies
Array) for the wait queue.

That way, we can eliminate a whole slew of Ruby method calls and
variable lookups, and make all the important wait queue operations
O(1).  The complexity improvement will probably matter less than you'd
expect, since wait queues aren't normally that long, but eliminating
all those method calls should be a big win.
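
To make the O(1) point concrete, here is a rough sketch of a linked
wait queue, written in Ruby only for readability.  The actual proposal
is to do this in C; the names WaitQueue and Node are made up for this
illustration and are not part of any existing library.

```ruby
# Hypothetical illustration: a singly linked FIFO wait queue.
# Both enqueueing and dequeueing touch only the head or tail node,
# so neither operation depends on the queue length.
class WaitQueue
  Node = Struct.new(:value, :next_node)

  attr_reader :size

  def initialize
    @head = nil
    @tail = nil
    @size = 0
  end

  def empty?
    @head.nil?
  end

  # Append a waiter at the tail: O(1).
  def push(value)
    node = Node.new(value, nil)
    if @tail
      @tail.next_node = node
    else
      @head = node
    end
    @tail = node
    @size += 1
    self
  end

  # Remove and return the longest-waiting entry: O(1).
  # Returns nil when the queue is empty.
  def shift
    node = @head
    return nil unless node
    @head = node.next_node
    @tail = nil unless @head
    @size -= 1
    node.value
  end
end
```

A doubly linked variant would also allow O(1) removal of an arbitrary
waiter, given its node, at the cost of one extra pointer per entry.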

Here's a benchmark script which illustrates how things stand right now:

        require 'thread'
        require 'benchmark'
        
        class Mutex
          def noop
            begin
              nil
            ensure
              nil
            end
          end
        end
        
        n = 1_000_000
        m = Mutex.new
        
        Benchmark.bm do |x|
          x.report( "raw:" ) { n.times { nil } }
          x.report( "open-coded noop:" ) { n.times {
            begin
              nil
            ensure
              nil
            end
          } }
          x.report( "noop method:" ) { n.times { m.noop { nil } } }
          x.report( "open-coded critical:" ) { n.times {
            saved = Thread.critical
            begin
              Thread.critical = true
              nil
            ensure
              Thread.critical = saved
            end
          } }
          x.report( "exclusive:" ) { n.times { Thread.exclusive { nil } } }
          x.report( "open-coded lock:" ) { n.times {
            m.lock
            begin
              nil
            ensure
              m.unlock
            end
          } }
          x.report( "synchronize:" ) { n.times { m.synchronize { nil } } }
        end
        
And the results on one of my machines:

                              user     system      total        real
        raw:                  1.450000   0.370000   1.820000 (  1.833878)
        open-coded noop:      2.700000   0.520000   3.220000 (  3.305178)
        noop method:          5.140000   1.280000   6.420000 (  6.590072)
        open-coded critical:  7.440000   0.830000   8.270000 (  8.518001)
        exclusive:           15.820000   2.880000  18.700000 ( 19.366240)
        open-coded lock:     21.840000   2.590000  24.430000 ( 25.227391)
        synchronize:         27.450000   3.490000  30.940000 ( 32.120969)
        
(Not a fast machine, evidently.)

But see, right now, the best choice for performance is open-coding an
ensure block with Thread.critical.  That's not what we want if
Thread.critical will be going away.

I'm reasonably confident that I can get the no-contention case for
Mutex#synchronize down to somewhere between "noop method" and
"open-coded critical".  At that point, Mutex#synchronize becomes the
most appealing option, making one less bump on the road when porting
code from 1.8 to 1.9.
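
For illustration, here is a minimal sketch of code written against
Mutex#synchronize alone, so it does not depend on Thread.critical at
all.  The Counter class is a made-up example, not something from the
benchmark above.

```ruby
require 'thread'

# Hypothetical example: a counter guarded by a Mutex instead of
# Thread.critical.
class Counter
  attr_reader :value

  def initialize
    @lock = Mutex.new
    @value = 0
  end

  # synchronize releases the lock even if the block raises, which is
  # the same guarantee the open-coded begin/ensure versions provide.
  def increment
    @lock.synchronize { @value += 1 }
  end
end

counter = Counter.new
threads = Array.new(4) { Thread.new { 1000.times { counter.increment } } }
threads.each { |t| t.join }
puts counter.value
```

Code in this style needs no changes if Thread.critical is removed, and
it gets faster for free once Mutex#synchronize does.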

-mental
