On Mon, 24 Nov 2008 14:39:45 +0900, Ryan Davis wrote:

> On Nov 23, 2008, at 01:11 , Kurt Stephens wrote:
> 
>> http://kurtstephens.com/node/72
>>
>> Evaluating :°∆foo_bar°« is 1.5 to 2.5 times faster than :foo_bar. This is
>> true for both Ruby 1.8.6-p287 and Ruby 1.9 trunk.  The most common
>> means
>> of specifying a Symbol literal is slower than the general expression to
>> create a Symbol from a String constant.
> 
> for at least 1.8 (and I highly suspect 1.9 tho I'm not checking), there
> is literally no difference between :blah, :'blah', and :"blah" to the
> parser and therefor the eval loop. You can see that via parse_tree:
> 
>> % parse_tree_show -u
>> a = :blah, :'blah', :"blah"
>> s(:lasgn,
>>  :a,
>>  s(:svalue, s(:array, s(:lit, :blah), s(:lit, :blah),
>> s(:lit, :blah))))
> 
> Likewise, your case with interpolation:
> 
>> % parse_tree_show -u
>> :"foo_#{'bar'}"
>> s(:lit, :foo_bar)
> 
> You could assign a local variable with 'bar' and do that instead for a
> more fair comparison:
> 
>> % parse_tree_show -u
>> bar = "bar"
>> :"foo_#{bar}"
>> s(:block,
>>  s(:lasgn, :bar, s(:str, "bar")),
>>  s(:dsym, "foo_", s(:evstr, s(:lvar, :bar))))
> 
> Finally, you NEED to add a null benchmark to your measurements. When
> your benchmarks are coming that close to the null benchmark, you have to
> suspect that your experiment is flawed or your sample size is too small.
> I'm getting measurements at 1m samples that are only 50% more than null.
> And as I increase the samples, the number converges DOWN towards null.
> At 5m it is only 40% more.

Frankly I'm surprised that the simpler tests aren't intentionally equal 
to the null benchmark. The parser should be able to handle the simple 
cases for you so that all the work is done once, before you execute the 
first line of code. (And this is the behavior I see on ruby 1.9)

>> # of iterations = 5000000
>>                                user     system      total        real
>> null_time                  0.810000   0.010000   0.820000 (  0.843928)
>> :foo_bar                   1.140000   0.010000   1.150000 (  1.167343)
>> :'foo_bar'                 1.170000   0.020000   1.190000 (  1.256334)
>> :"foo_bar"                 1.110000   0.010000   1.120000 (  1.191425)
>> :"foo_#{'bar'}"            1.130000   0.010000   1.140000 (  1.164772)
>> "foo_#{'bar'}".to_sym      2.450000   0.200000   2.650000 (  2.702624)
>> "foo_bar".to_sym           2.530000   0.180000   2.710000 (  2.770743)
>> :"foo_#{bar}"              4.300000   0.430000   4.730000 (  7.484217)
>> "foo_#{bar}".to_sym        5.040000   0.170000   5.210000 (  5.493593)
> 
> What this is really saying to me, barring the GC issue you found, is
> that the numbers for plain symbols don't really matter at all. They're
> about as low as they're gonna get.
> 
> I do find it interesting that :"foo_#{bar}" is currently 40% slower than
> "foo_#{bar}".to_sym... I suspect there might be a minor issue there. But
> for the rest? *shrug* seems acceptable to me.

I'm not seeing this on 1.8 or 1.9. It must be the GC kicking in on your 
system, in the middle of this benchmark, because that benchmark happens 
to be the straw that broke the camel's back. In which case, GC isn't an 
issue either -- just a totally appropriate behavior that happens to be 
kicking in at just the wrong time. To deal with this, either run the 
tests in a random order and average over several runs or run GC.start 
before each benchmark.

Here's my Ruby 1.9 tests, run for 10_000_000 iterations, seem to support 
the reasonable conclusions various people have mentioned.

                           user     system      total        real
null                   1.460000   0.000000   1.460000 (  1.644902)
:foo_bar               1.480000   0.000000   1.480000 (  1.654968)
:'foo_bar'             1.440000   0.000000   1.440000 (  1.594285)
:"foo_bar"             1.450000   0.000000   1.450000 (  1.651050)
"foo_bar".to_sym       8.640000   0.060000   8.700000 (  9.806916)
:"foo_#{'bar'}"        1.440000   0.000000   1.440000 (  1.586956)
"foo_#{'bar'}".to_sym  8.810000   0.050000   8.860000 ( 10.161624)
:"foo_#{bar}"         14.080000   0.040000  14.120000 ( 16.065701)
"foo_#{bar}".to_sym   13.980000   0.040000  14.020000 ( 15.937608)

On Ruby 1.8, the null loop is faster than one that refers to a symbol 
constant, and I don't know why. Is the parser not doing the work, or is 
it just time taken to visit an additional AST node and return a return 
value?

                           user     system      total        real
null                   1.770000   0.000000   1.770000 (  1.858659)
:foo_bar               4.240000   0.900000   5.140000 (  5.501225)
:'foo_bar'             4.260000   0.880000   5.140000 (  5.498005)
:"foo_bar"             4.320000   0.790000   5.110000 (  5.425099)
"foo_bar".to_sym      11.150000   0.970000  12.120000 ( 13.336549)
:"foo_#{'bar'}"        4.300000   0.830000   5.130000 (  5.541806)
"foo_#{'bar'}".to_sym 11.120000   1.030000  12.150000 ( 13.236672)
:"foo_#{bar}"         15.180000   1.060000  16.240000 ( 18.203953)
"foo_#{bar}".to_sym   17.560000   0.990000  18.550000 ( 20.952985)


(Hot tip: the number that you pass to Benchmark.bm isn't the number of 
tests, it's the width of the labels. There's no need for the "\t\t" stuff 
that you had in the tests.)

require 'benchmark'

n = 10_000_000

Benchmark.bm('"foo_#{\'bar\'}".to_sym'.size) { | x |
  GC.start
  x.report("null") do
    n.times do
    end
  end
  
  GC.start

  x.report(':foo_bar') {
    n.times { 
      :foo_bar
    }
  }
  GC.start
  x.report(":'foo_bar'") {
    n.times { 
      :'foo_bar'
    }
  }
  GC.start
  x.report(':"foo_bar"') {
    n.times { 
      :"foo_bar"
    }
  }
  GC.start
  x.report('"foo_bar".to_sym') {
    n.times { 
      "foo_bar".to_sym
    }
  }
  GC.start
  x.report(':"foo_#{\'bar\'}"') { 
    n.times { 
      :"foo_#{'bar'}"
    }
  }
  GC.start
  x.report('"foo_#{\'bar\'}".to_sym') {
    n.times {
      "foo_#{'bar'}".to_sym
    }
  }
  GC.start
  x.report(':"foo_#{bar}"') do
    bar="bar"
    n.times do
      :"foo_#{bar}"
    end
  end
  GC.start
  x.report('"foo_#{bar}".to_sym') do
    bar="bar"
    n.times do
      "foo_#{bar}".to_sym
    end
  end
}


-- 
Chanoch (Ken) Bloom. PhD candidate. Linguistic Cognition Laboratory.
Department of Computer Science. Illinois Institute of Technology.
http://www.iit.edu/~kbloom1/