Issue #9262 has been updated by sam.saffron (Sam Saffron).


Before I measure the changes proposed by funny falcon I wanted to present the sad state of affairs.

I applied this patch to gather method cache statistics: https://github.com/SamSaffron/ruby/commit/e6ecf6d8390b2e4036b91778268a7544f364b8f1

I then wrote some middleware to measure the impact of the method cache while browsing around Discourse in production mode:

class CacheStatMiddleware
    def initialize(app, config={})
      @app = app
    end

    def call(env)
      start = Time.new
      hits = Kernel.cache_hits
      misses = Kernel.cache_misses
      time = Kernel.cache_miss_time

      r = @app.call(env)
      lost = Kernel.cache_miss_time - time
      total = ((Time.new - start)* 1000 * 1000).to_i

      pct = (lost.to_f / total.to_f) * 100

      puts "hits #{Kernel.cache_hits - hits} " <<
           "misses #{Kernel.cache_misses - misses } " <<
           "microsecs lost #{lost} #{pct.round(2)}%  " <<
           "cache usage #{Kernel.cache_entries} / #{Kernel.cache_size} " <<
           "req duration #{total}  "

      r
    end
  end

  config.middleware.insert 0, CacheStatMiddleware


---- 


Here are some results:

hits 74383 misses 5095 microsecs lost 4620 7.75%  cache usage 2048 / 2048 req duration 59613  
hits 74033 misses 5093 microsecs lost 4750 8.04%  cache usage 2048 / 2048 req duration 59070  
hits 73962 misses 5106 microsecs lost 4509 7.74%  cache usage 2048 / 2048 req duration 58253  
hits 677 misses 229 microsecs lost 393 6.23%  cache usage 2048 / 2048 req duration 6305  
hits 523 misses 35 microsecs lost 211 3.68%  cache usage 2048 / 2048 req duration 5738  
hits 549 misses 31 microsecs lost 279 3.58%  cache usage 2048 / 2048 req duration 7804  
hits 4412 misses 143 microsecs lost 97 3.14%  cache usage 2048 / 2048 req duration 3088  
hits 47 misses 18 microsecs lost 21 8.97%  cache usage 2048 / 2048 req duration 234  
hits 3405 misses 130 microsecs lost 152 4.88%  cache usage 2048 / 2048 req duration 3115  
hits 519 misses 158 microsecs lost 214 8.1%  cache usage 2048 / 2048 req duration 2641  
hits 114133 misses 14584 microsecs lost 14338 9.31%  cache usage 2048 / 2048 req duration 154000  
hits 44835 misses 2754 microsecs lost 2850 8.16%  cache usage 2048 / 2048 req duration 34914  
hits 75084 misses 8601 microsecs lost 8048 10.95%  cache usage 2048 / 2048 req duration 73475  


--- 

Conclusion: 

For an app like Discourse 3-10% of request time is occupied looking up methods, due to cache inefficiency.

Out of the box cache is just too small to fit all the entries required, so its thrashing. 


---

If we raise the method cache x16 size we see:

hits 720 misses 5 microsecs lost 11 0.44%  cache usage 17045 / 32768 req duration 2509  
hits 40 misses 0 microsecs lost 0 0.0%  cache usage 17061 / 32768 req duration 242  
hits 53076 misses 266 microsecs lost 685 1.67%  cache usage 17090 / 32768 req duration 41001  
hits 47234 misses 238 microsecs lost 315 1.24%  cache usage 17117 / 32768 req duration 25418  
hits 53012 misses 281 microsecs lost 850 2.03%  cache usage 17145 / 32768 req duration 41803  
hits 720 misses 5 microsecs lost 15 0.65%  cache usage 17163 / 32768 req duration 2308  
hits 52353 misses 261 microsecs lost 605 1.63%  cache usage 17190 / 32768 req duration 37031  
hits 46785 misses 190 microsecs lost 429 1.32%  cache usage 17207 / 32768 req duration 32392  
hits 52893 misses 273 microsecs lost 610 1.71%  cache usage 17240 / 32768 req duration 35571  
hits 720 misses 5 microsecs lost 7 0.45%  cache usage 17258 / 32768 req duration 1551  
hits 52295 misses 261 microsecs lost 709 1.75%  cache usage 17287 / 32768 req duration 40552  
hits 46784 misses 191 microsecs lost 393 1.35%  cache usage 17308 / 32768 req duration 29146  
hits 86409 misses 2013 microsecs lost 2655 3.4%  cache usage 17829 / 32768 req duration 78033  
hits 644 misses 7 microsecs lost 22 0.64%  cache usage 17843 / 32768 req duration 3458  
hits 46729 misses 295 microsecs lost 423 1.73%  cache usage 17864 / 32768 req duration 24385  
hits 82777 misses 893 microsecs lost 1807 2.57%  cache usage 17972 / 32768 req duration 70341  
hits 46819 misses 191 microsecs lost 400 1.18%  cache usage 17989 / 32768 req duration 33862  
hits 82412 misses 865 microsecs lost 1426 2.12%  cache usage 18099 / 32768 req duration 67163  


---

Even when the cache has room, too many collisions are happening forcing cache eviction when uneeded, leave 2% of the perf on the table, this is significantly better than 2-10% though. 

This can be fixed possibly by introducing quadratic probing or perhaps a cuckoo like algorithm. 

The global method cache has a couple of advantages over per class cache and a few disadvantage.

Advantages:

1. Size is cleanly capped
2. Very easy to gather stats and info about it
3. Expiry semantics are fairly clean, can force a method_state increment when utilization is too high

Disadvantages: 

1. Extra hashing required to look up elements, you need the klass hash
2. Extra storage required (klass_seq stored per element) 






----------------------------------------
Bug #9262: global_method_cache should be configurable or grow automatically
https://bugs.ruby-lang.org/issues/9262#change-43837

Author: tmm1 (Aman Gupta)
Status: Open
Priority: Normal
Assignee: ko1 (Koichi Sasada)
Category: 
Target version: next minor
ruby -v: trunk
Backport: 1.9.3: UNKNOWN, 2.0.0: UNKNOWN


The global_method_cache is currently a fixed 2048 entries. This is not configurable, and can be inadequate for large applications with thousands of classes and methods.

In our app, increasing the size of the cache to 32768 entries reduced time spent in search_method and overall pressure on st_lookup:

before
      420  14.0% st_lookup
      182   6.1% vm_search_method (inline)

after
      265   9.5% st_lookup
      125   4.5% vm_search_method (inline)

It's possible the VM could grow global_method_cache automatically, using some heuristic based on the number of long-lived classes or method_entries that are defined.
However this would break the hashing in the current implementation.


-- 
http://bugs.ruby-lang.org/