A slightly different idea, closer to the existing garbage collection:

The existing garbage collection is based on finding all values that 
could possibly be pointers to heap locations. This is done by scanning 
the stack and all other locations that may hold heap pointers.

So to garbage collect symbols, one would scan all the locations that 
possibly might contain symbols. As far as I remember from Minero Aoki's 
black book, pointers, integers, and symbols (and true/false/nil) 
partition the space of 32-bit (or 64-bit) integers.
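
To make the partitioning concrete, here is a rough sketch of how a single 
machine word could be classified. The tag values are my assumption, 
modelled from memory on 1.9-era MRI; the real layout differs between 
versions and platforms, and classify is a made-up helper, not an 
interpreter function.

```ruby
# Illustrative tagging scheme only (assumed, not taken from the MRI source).
SYMBOL_TAG = 0x0e                               # assumed low 8 bits of a symbol word
SPECIALS   = { 0 => :false, 2 => :true, 4 => :nil }  # assumed special constants

# Hypothetical helper: decide which partition a machine word falls into.
def classify(word)
  return :fixnum         if (word & 1) == 1     # small integers carry a low tag bit
  return SPECIALS[word]  if SPECIALS.key?(word) # false / true / nil
  return :symbol         if (word & 0xff) == SYMBOL_TAG
  return :pointer        if (word & 3) == 0     # aligned address, possibly a heap object
  :unused
end

classify(7)                # fixnum
classify((5 << 8) | 0x0e)  # symbol with ID 5
classify(0x1000)           # pointer
```

Because each word lands in exactly one class, a scanner can tell a 
candidate symbol from a candidate pointer by looking at the word alone.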

This scan would have to include all the Ruby-internal data structures 
that use symbols. As with the current "pessimistic" (conservative) 
garbage collector, any 32-bit (or 64-bit) value that is found and equals 
an existing symbol would make that symbol non-collectible. A symbol for 
which no such value is found would be collected.
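
As a toy simulation of that mark-and-sweep over symbols (not interpreter 
code): symbol_table, roots, and sweep_symbols are all hypothetical names, 
and the word values are made up for illustration.

```ruby
require "set"

# symbol_table: hypothetical map from a symbol's machine-word value to its name.
# roots: every word found on the stack, in the heap, and in internal tables.
def sweep_symbols(symbol_table, roots)
  live = Set.new
  roots.each do |word|
    # Conservative: a word that equals a symbol's value pins that symbol,
    # even if the word is really an unrelated integer.
    live << word if symbol_table.key?(word)
  end
  # Keep only the symbols that some scanned word referred to.
  symbol_table.select { |word, _name| live.include?(word) }
end

table     = { 0x10e => "foo", 0x20e => "bar", 0x30e => "baz" }
roots     = [0x10e, 42, 0x30e, 0x7fff0000]
survivors = sweep_symbols(table, roots)
# "foo" and "baz" survive; "bar" has no matching word and is collected.
```

The downside is already visible here: a plain integer that happens to 
equal 0x20e would keep "bar" alive forever, just as a stray integer can 
pin a heap object today.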

Anyway, this is just an idea; there may be quite a few downsides to it.

Regards,   Martin.


On 2013/02/06 23:21, SASADA Koichi wrote:

> One rough idea (but not verified) is:
>
> Separate Symbols into two sets:
>    (a) Symbols created by rb_intern()
>    (b) Symbols created from String object (String#to_sym)
>
> (a) is internal symbols which are used by the interpreter such as method
> names, attribute names and so on.
>
> (b) is mainly created by ruby program (and used for DoS attack).
>
> I think (hope) (b) can be collected at GC timing with some development
> efforts. But not touched. Sorry.
>
> PS. Of course, a program that creates symbols belonging to (a) is also
> open to a DoS attack. For example, if the program defines many methods
> or attributes from untrusted data, it will have the same problem (but I
> believe nobody writes such a bad program).
>