On Thu, Dec 29, 2005 at 07:30:44AM +0900, Jim Weirich wrote:
> Why are symbols not garbage collected?  Because a symbol represents a 
> mapping from a string name to a unique object.  Anytime in the execution 
> of the program, if that name is used for a symbol, the original symbol 
> object must be returned.  If the symbol is garbage collected, then a 
> later reference to the symbol name will return a different object. 
> That's generally frowned upon (although I don't really see the harm.  If 
> the original symbol was GC'ed, nobody cared what the original object was 
> anyways.  But that's the way it works).

Keep in mind that symbols are immediate values backed by
the global_symbols table (actually global_symbols.tbl and
global_symbols.rev, for the name => id and id => name associations
respectively). Since the lower bits encode information like ID_LOCAL,
ID_INSTANCE, etc., symbol values cannot point to the appropriate entry
in global_symbols the same way VALUEs for normal objects point to RVALUE
slots in the object heaps. [1]
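
To make the bit-packing concrete, here is a rough sketch of how an
immediate symbol value might be put together. The shifts, masks and flag
values are assumptions picked for illustration, not copied from the
sources, and the helper names (make_id, id2sym, sym2id, symbol_p,
id_scope) are made up for this example:

```c
#include <assert.h>

typedef unsigned long VALUE;
typedef unsigned long ID;

/* Assumed layout, for illustration only: the low 3 bits of an ID hold
 * the scope tag, and a symbol VALUE is the ID shifted left by 8 bits
 * with a fixed flag byte in the low bits. */
#define ID_SCOPE_SHIFT 3
#define ID_SCOPE_MASK  0x07UL
#define ID_LOCAL       0x01UL
#define ID_INSTANCE    0x02UL
#define SYMBOL_SHIFT   8
#define SYMBOL_FLAG    0x0eUL

/* Build an ID from a serial number (hypothetical) and a scope tag. */
static ID make_id(unsigned long serial, unsigned long scope)
{
    assert(scope <= ID_SCOPE_MASK);
    return (serial << ID_SCOPE_SHIFT) | scope;
}

/* Pack an ID into an immediate VALUE; no memory is allocated. */
static VALUE id2sym(ID id) { return (id << SYMBOL_SHIFT) | SYMBOL_FLAG; }

/* A VALUE is a symbol iff its low byte equals the flag. */
static int symbol_p(VALUE v) { return (v & 0xffUL) == SYMBOL_FLAG; }

/* Recover the ID, and from it the scope tag. */
static ID sym2id(VALUE v) { return v >> SYMBOL_SHIFT; }
static unsigned long id_scope(ID id) { return id & ID_SCOPE_MASK; }
```

The point is that every bit of the VALUE is already spoken for: the word
is a packed number, so there is no room left for it to double as the
address of a table entry.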

During the mark phase, the stack must be scanned for references to live
objects. It's easy to check whether a word on the stack (an unsigned long
or unsigned long long, depending on the platform) seems to point to an
object: see if the address falls into the area occupied by some heap and
actually corresponds to a slot. In order to mark symbol entries
in global_symbols, a lookup in global_symbols.rev would be needed for each
word in the stack. I conjecture that this would be relatively expensive, but
there are probably better reasons for the current behavior (it's too late to
read the sources in detail though :)...
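
For comparison, the "does this word look like an object?" test can be
sketched as a pure range-and-alignment check. This is only a toy model:
Slot stands in for RVALUE, and Heap, looks_like_object and
count_candidates are hypothetical names, not the interpreter's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for an RVALUE slot; the real layout is different. */
typedef struct { unsigned long flags; void *payload[4]; } Slot;

/* One contiguous block of slots (assumed not to overlap other heaps). */
typedef struct { Slot *slots; size_t count; } Heap;

/* Returns 1 if `word` falls inside some heap and lands exactly on a
 * slot boundary, 0 otherwise. No hash lookup is involved. */
static int looks_like_object(const Heap *heaps, size_t nheaps, uintptr_t word)
{
    for (size_t i = 0; i < nheaps; i++) {
        uintptr_t lo = (uintptr_t)heaps[i].slots;
        uintptr_t hi = lo + heaps[i].count * sizeof(Slot);
        if (word < lo || word >= hi)
            continue;                     /* not in this heap */
        return (word - lo) % sizeof(Slot) == 0;
    }
    return 0;
}

/* Scan a range of stack words and count the plausible references. */
static size_t count_candidates(const Heap *heaps, size_t nheaps,
                               const uintptr_t *stack, size_t nwords)
{
    size_t n = 0;
    for (size_t i = 0; i < nwords; i++)
        n += (size_t)looks_like_object(heaps, nheaps, stack[i]);
    return n;
}
```

Each stack word costs a few comparisons per heap here. Doing the same
for symbols would mean a table lookup per word instead, which is the
cost I'm guessing at above.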

[1] Even if those bits were not used, another level of indirection would
be needed due to the way the hash table works.

-- 
Mauricio Fernandez