Issue #11158 has been updated by Franck Verrot.


Marc-Andre Lafortune wrote:
> Franck Verrot wrote:
> > Isn't there way too much overhead to include `Enumerable` in `Symbol`?
> 
> Not sure what you mean by overhead. There's no performance cost to it. It adds a bunch of methods to `Symbol`, and many won't be helpful (I doubt someone would use `Symbol.map{...}`), but I'm not sure I see the downside.

Sorry, I didn't phrase this well :-) I was only wondering whether including `Enumerable` in `Symbol` could lead some of us to rely on methods (like `map`, as you said) that weren't really thought through at the time we introduced `each`. Maybe that doesn't make sense, so feel free to ignore this comment... I'm still new to the Ruby VM internals and to the way its APIs are designed :-)
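To make the concern concrete, here is a minimal sketch of what extending a class with `Enumerable` brings along once a class-level `each` exists. `SymbolTableStandIn` and its `NAMES` list are hypothetical stand-ins made up for illustration; this is not the attached patch and it does not touch `Symbol` itself.

```ruby
# Hypothetical stand-in for a class that exposes a class-level `each`,
# roughly the shape discussed for Symbol. Not the real symbol table.
class SymbolTableStandIn
  extend Enumerable # class-level methods, so `extend` rather than `include`

  NAMES = %i[foo bar baz].freeze

  def self.each(&block)
    return enum_for(:each) unless block
    NAMES.each(&block)
    self
  end
end

# Only `each` was written by hand, yet all of Enumerable is now callable:
puts SymbolTableStandIn.count                        # iterates via each
puts SymbolTableStandIn.map(&:to_s).inspect
puts SymbolTableStandIn.respond_to?(:each_slice)
```

Every method listed in `Enumerable` becomes public API of the class at once, whether or not anyone reviewed it for that class.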

Thanks!

----------------------------------------
Feature #11158: Introduce a Symbol.count API as a more efficient alternative to Symbol.all_symbols.size
https://bugs.ruby-lang.org/issues/11158#change-53079

* Author: Lourens Naud
* Status: Open
* Priority: Normal
* Assignee: Koichi Sasada
----------------------------------------
We're in the process of migrating a very large Rails codebase from a Ruby 2.1.6 runtime to Ruby 2.2.2, and as part of this migration we would like to keep track of Symbol counts and Symbol GC efficiency in our metrics system. Ideally we would start while still on 2.1 (which implies a backport to 2.1), but this would definitely be useful on 2.2 as well.

Currently the recommended and only reliable way to get the Symbol count is via Symbol.all_symbols.size, which:

* Allocates an Array
* Walks the symbol table and calls rb_ary_push for each entry, which isn't exactly efficient

Here are some benchmarks:

~~~
./miniruby -Ilib -rbenchmark -e "p Benchmark.measure { 10_000.times{ Symbol.count } }"
#<Benchmark::Tms:0x007f8bc208bdd0 @label="", @real=0.0011274919961579144, @cstime=0.0, @cutime=0.0, @stime=0.0, @utime=0.01, @total=0.01>
~~~

~~~
./miniruby -Ilib -rbenchmark -e "p Benchmark.measure { 10_000.times{ Symbol.all_symbols.size } }"
#<Benchmark::Tms:0x007fa47205a550 @label="", @real=0.3135859479953069, @cstime=0.0, @cutime=0.0, @stime=0.03, @utime=0.29, @total=0.31999999999999995>
~~~
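For tracking the count from application code on an interpreter that may or may not carry the patch, something along these lines could work. It is only a sketch: `symbol_count` and `time_symbol_count` are hypothetical helper names, `Symbol.count` is the API proposed here and is assumed to be absent on a stock Ruby, and timing uses the monotonic clock instead of `Benchmark` purely to keep the snippet dependency-free.

```ruby
# Hypothetical helper: use the proposed Symbol.count when the running
# interpreter provides it, otherwise fall back to the Array-building form.
def symbol_count
  if Symbol.respond_to?(:count)
    Symbol.count            # proposed API; assumed absent on stock Ruby
  else
    Symbol.all_symbols.size # allocates an Array of every Symbol
  end
end

# Hypothetical helper: elapsed wall-clock seconds for `iterations` calls.
def time_symbol_count(iterations = 10_000)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { symbol_count }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

puts "symbols: #{symbol_count}"
puts "1_000 calls took #{time_symbol_count(1_000)}s"
```

Timings will differ from the miniruby numbers above, since a full Rails process holds far more Symbols than a bare interpreter.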

I implemented and attached a patch for a simple Symbol.count API that just returns a numeric version of the symbol table size, without having to do any iteration.

Please let me know if this is in line with what's expected of a core API, whether there's anything I could clean up further, and whether there's any possibility of such a change also being backported to 2.1 (happy to create a new patch for 2.1).

---Files--------------------------------
symbol_count.patch (4.4 KB)
symbol_enumerator.patch (6.07 KB)


-- 
https://bugs.ruby-lang.org/