On Thu, Apr 14, 2011 at 12:55 AM, Clifford Heath <no / spam.please.net> wrote:
> I presented results from MRI 1.8.7, JRuby 1.6.0, and Rubinius,
> and showed that they all had different shortcuts, and none
> reliably kept the Hash contract (of using eql? and hash).
> I.e. you can't rely on sensible code working the same in MRI
> and JRuby.

I posted results on JRuby master (1.6.1). You responded to that email with:

"That result is reasonable, using MRI. Interesting, and nice, that it
figures out it doesn't need to call Fixnum#hash to be able to choose
a bucket. (BTW, that *uncommon* case means there's a test and branch
which is superfluous and excessively costly for the more common case,
according to arguments you've made against detecting monkey patching)

If you do the same thing with JRuby however, the first test contains this:

Looking up using Integer: nil"

I'm not sure you actually looked at my results.

> I think that's a problem. If you don't, then I'm done...

It may be a problem, or it may not. When running your example,
however, it seems to call your monkeypatched code in many places where
you claim it doesn't. So I'm still confused. I know we don't call
hash/eql? in all cases, but I'm trying to quantify what the correct
behavior should be and what that behavior would cost.

I'd be happy to continue discussing this as a JRuby issue. Would you
file something at http://bugs.jruby.org with expected and actual JRuby
1.6.1 results?

> Unless you care to point me to the place in the JRuby code
> where this shortcut occurs (where I could make a change to
> make it invisible), and a performance benchmark that would
> show the effect of doing so. Then I'll happily make the
> experiment to see whether I'm right (and the shortcut can
> be made invisible without affecting performance measurably,
> i.e. above the noise level of the benchmark). If I'm not,
> I'll openly admit I was wrong...
> but I've done some pretty
> hardcore optimizing in machine code before, and I think I
> can win this one.

I think you are underestimating the cost of performing a dynamic call.
Even in an optimizing VM (like JRuby/JVM) there's a much higher cost
for a dynamic call to "hash" than to just check that it's a Fixnum and
branch to custom logic. *Way* higher cost.

Here's stock JRuby 1.6.1, which isn't dispatching to "hash" for Fixnums:

~/projects/jruby jruby --server -rbenchmark -e "5.times { h = {}; h[1000] = 1000; puts Benchmark.measure { 10_000_000.times { h[1000] } } }"
  0.908000   0.000000   0.908000 (  0.859000)
  0.623000   0.000000   0.623000 (  0.622000)
  0.699000   0.000000   0.699000 (  0.699000)
  0.747000   0.000000   0.747000 (  0.747000)
  0.753000   0.000000   0.753000 (  0.753000)

Here's the same benchmark, dispatching to "hash" through a per-class
cache (faster than typical call-site caching, roughly on par with
inlined calls):

~/projects/jruby jruby --server -rbenchmark -e "5.times { h = {}; h[1000] = 1000; puts Benchmark.measure { 10_000_000.times { h[1000] } } }"
  1.634000   0.000000   1.634000 (  1.580000)
  1.297000   0.000000   1.297000 (  1.297000)
  1.356000   0.000000   1.356000 (  1.355000)
  1.334000   0.000000   1.334000 (  1.334000)
  1.343000   0.000000   1.343000 (  1.344000)

Now using an even faster check, that only skips the dispatch to "hash"
if the object is a Fixnum and the Fixnum class has not been reopened.
Notice it's faster than full dyncall, but still a good bit slower than
the fast path:

~/projects/jruby jruby --server -rbenchmark -e "5.times { h = {}; h[1000] = 1000; puts Benchmark.measure { 10_000_000.times { h[1000] } } }"
  1.057000   0.000000   1.057000 (  1.014000)
  0.977000   0.000000   0.977000 (  0.976000)
  0.885000   0.000000   0.885000 (  0.885000)
  0.903000   0.000000   0.903000 (  0.903000)
  0.871000   0.000000   0.871000 (  0.871000)

And 1.9.2 to compare:

~/projects/jruby ruby1.9 -rbenchmark -e "5.times { h = {}; h[1000] = 1000; puts Benchmark.measure { 10_000_000.times { h[1000] } } }"
  1.270000   0.010000   1.280000 (  1.313950)
  1.270000   0.010000   1.280000 (  1.307163)
  1.270000   0.000000   1.270000 (  1.295588)
  1.260000   0.010000   1.270000 (  1.285083)
  1.260000   0.010000   1.270000 (  1.307108)

Bottom line is that *any* additional branching logic will add
overhead, and full dynamic calling introduces even more overhead on
just about any implementation. Whether that's a fair trade-off is not
for me to decide ;)

- Charlie
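
For anyone following along, the three strategies compared in the
benchmarks above can be modelled roughly in plain Ruby. This is only a
sketch under assumptions, not JRuby's code: the BucketChooser module,
its method names, the bucket count and the integer_reopened flag are
all hypothetical, and Integer stands in for Fixnum, which current
Rubies no longer define.

```ruby
require 'benchmark'

# Hypothetical Ruby-level model of the three lookup strategies.
# This is NOT JRuby's implementation; every name here is illustrative.
module BucketChooser
  BUCKETS = 16

  class << self
    # Stand-in for "has the Fixnum class been reopened?". A real VM would
    # have to track this itself; here the flag is flipped by hand.
    attr_accessor :integer_reopened
  end
  self.integer_reopened = false

  # Fast path: a type check and a branch, never dispatching to #hash
  # for integers.
  def self.fast(key)
    code = key.is_a?(Integer) ? key : key.hash
    code % BUCKETS
  end

  # Guarded fast path: one extra test, so #hash is called again once the
  # flag says the class was reopened.
  def self.guarded(key)
    code = if key.is_a?(Integer) && !integer_reopened
             key
           else
             key.hash
           end
    code % BUCKETS
  end

  # Full dynamic dispatch: always calls #hash, whatever the key is.
  def self.dynamic(key)
    key.hash % BUCKETS
  end
end

if __FILE__ == $PROGRAM_NAME
  n = 1_000_000
  Benchmark.bm(8) do |bm|
    bm.report('fast')    { n.times { BucketChooser.fast(1000) } }
    bm.report('guarded') { n.times { BucketChooser.guarded(1000) } }
    bm.report('dynamic') { n.times { BucketChooser.dynamic(1000) } }
  end
end
```

The method calls into BucketChooser will likely dominate the timings
here, so the output says little about what the same branches cost
inside a VM. The sketch is meant to show where the extra test and the
extra dispatch sit, not to reproduce the numbers above.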