Issue #5964 has been updated by Kurt Stephens.


Joshua Ballanco wrote:
> But leaving all of that aside, you have to consider that Symbols are *never* collected. This is required for their semantics. 

True, CRuby Symbols are not collected.  However, in general, this is not required for every implementation of "symbols".  There is an open bug to make CRuby Symbols collectable, but it will require C API changes.  What semantics prevent Ruby Symbols from being collected?  

Why should Strings and Symbols be distinct?  Try adding the following "convenience" and watch what happens to performance and related semantics; I have seen this code in the wild:

  class Symbol; def ==(other); self.to_s == other.to_s; end; end
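A sketch of what that patch does to equality semantics (a hypothetical session; behavior as in current CRuby):

```ruby
# The "convenience" patch from above: Symbol equality now allocates
# two Strings on every comparison and crosses the type boundary.
class Symbol
  def ==(other)
    to_s == other.to_s
  end
end

# Worse, equality is no longer symmetric: String#== was not patched.
p(:name == "name")   # true  -- the Symbol side converts both to Strings
p("name" == :name)   # false -- the String side still rejects Symbols
```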

Could Symbols behave more like Strings?  Sure.  I wish Symbol#<=>(Symbol) existed, and for some String-related operations, Symbols on the right-hand side might be automatically coerced.

But to conflate the identity properties of Symbols and Strings would be a mistake.  Those who think Symbols and Strings should be one and the same may not understand the benefits of their differences: a Symbol's representation is its identity, and that representation is immutable; a String has neither property.

Symbols represent concepts; Strings represent data.  Their difference in identity helps maintain that distinction and has important performance and semantic implications.
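The identity difference is easy to see directly (a minimal sketch; output as in current CRuby):

```ruby
# Every occurrence of a Symbol literal is the same immutable object;
# each String literal evaluates to a fresh, mutable object.
a = :status
b = :status
p a.equal?(b)   # true  -- one object, so comparison is identity
p a.frozen?     # true  -- its representation cannot change

s = "status"
t = "status"
p s.equal?(t)   # false -- two distinct objects with equal contents
s << "!"        # mutation is allowed; content and identity diverge
p s             # "status!"
```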


----------------------------------------
Feature #5964: Make Symbols an Alternate Syntax for Strings
https://bugs.ruby-lang.org/issues/5964

Author: Tom Wardrop
Status: Open
Priority: Normal
Assignee: 
Category: 
Target version: 


Or, to put it another way, make Symbols and Strings one and the same - interchangeable and indistinguishable from the other.

This may seem odd given that the whole point of Symbols is to have a semantically different version of a string, one that allows you to derive different meaning from the same data. But I think we have to compare the reason Symbols exist, and the benefits they offer, with how their use has evolved and the problems that evolution has introduced. There are a few main points I wish to make to begin what could be a lengthy discussion.

(1) Most people use Symbols because they're prettier and quicker to type than strings. A classic example of this is demonstrated with the increasing use of the short-hand Hash syntax, e.g. {key: 'value'}. This syntax is not supported for strings, and therefore only further encourages the use of symbols instead of strings, even where semantics and performance are not contributing factors.
(2) While the runtime semantics of Symbols would be lost, such as where the type of an object (Symbol or String) determines the logical flow, the syntactic semantics would remain. In other words, Symbols would still be able to convey programmer intent and assist code readability. For example, where strings are used for Hash keys and values, one may wish to use the symbolic syntax for the keys and the traditional quoted syntax for the values.
(3) Runtime semantics are of course beneficial, but the cons are an unavoidable consequence. I mention this because I briefly considered the possibility of introducing an alternate means of injecting semantics into strings, but I quickly realised that any such thing would take you back to square one, where you would have to be aware of the context in which a string-like object is used. It goes against the duck-typing philosophy of Ruby: if it walks and quacks like a string, why treat Symbols and Strings any differently?
(4) Performance is the other potential benefit of Symbols. It's important to note that strings created with the symbolic syntax could still be handled internally in the same way Symbols are now: interned, and therefore comparable by object ID.
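The interning idea in (4) already exists for Strings in CRuby, in a limited form: deduplicated frozen strings share identity, so equality can be a pointer comparison. A sketch using String#-@ (available in CRuby since 2.5; shown only to illustrate the mechanism, not as part of this proposal):

```ruby
# String#-@ returns a deduplicated frozen copy: equal contents yield
# the very same object, just like a Symbol literal would.
a = -'route'
b = -'route'
p a.equal?(b)   # true -- an identity check suffices

# Ordinary bare literals do not share identity:
p 'route'.equal?('route')   # false -- two allocations
```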

By removing the semantic difference between Strings and Symbols, and making Symbols merely an alternate String literal syntax, you eliminate the headache of having to coerce Strings into Symbols, and vice versa. I'm sure we've all written one of those methods that has about 6 #to_sym and #to_s calls. The runtime semantics that are lost by making Symbols and Strings one and the same, can be compensated for in other ways.
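The coercion noise looks something like this (a hypothetical helper; the method and option names are made up for illustration):

```ruby
# Accept either a Symbol or a String key, as many real-world
# option-handling methods end up doing:
def fetch_option(opts, name)
  opts[name.to_sym] || opts[name.to_s]
end

p fetch_option({ timeout: 30 }, 'timeout')     # 30
p fetch_option({ 'retries' => 3 }, :retries)   # 3
```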

Consider the case of the #erb method in the Sinatra web framework, which assumes a Symbol is the path to the markup file and a String is the markup itself. This method could be split in two, e.g. #erb for a string and #erbf for a file path. Or you could use a string prefix or suffix, e.g. #erb './some_file', to indicate a file path. Or, as a final example, you could simply go with an options hash, e.g. #erb 'some string' for a string, or #erb file: './some_file'. It all depends on the circumstance.
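A hedged sketch of that type-based dispatch (not Sinatra's actual implementation; the render helpers are stand-ins):

```ruby
# Stand-in renderers, for illustration only:
def render_file(path)     "rendered file: #{path}"     end
def render_inline(markup) "rendered markup: #{markup}" end

# Sinatra-style dispatch on the argument's type:
def erb(template)
  if template.is_a?(Symbol)
    render_file("views/#{template}.erb")   # Symbol => template name
  else
    render_inline(template)                # String => literal markup
  end
end

p erb(:index)        # "rendered file: views/index.erb"
p erb('<p>Hi</p>')   # "rendered markup: <p>Hi</p>"
```

If Symbols and Strings were indistinguishable, this dispatch would have to move into the method's name or its options, as the examples above suggest.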

The whole point is to have only one object for strings, and that's the String object. Making Symbol a subclass of String wouldn't solve the problem, though it may make it more bearable.

I'm hoping those who read this consider the suggestion fairly, and don't automatically defend Symbols on merit.


-- 
http://bugs.ruby-lang.org/