Issue #8992 has been updated by sam.saffron (Sam Saffron).


@headius

There are 3 things being discussed here, and I think it is fairly important that we split them out.

1. Parser optimisation for "string".freeze
2. Unconditionally have #freeze return a pooled string
3. Change the semantics of #freeze so it amends the current object and operates like .NET / Java intern does. 


1) is completely doable with few side effects. My caveat is that if 1) is the only thing done, the semantics of #freeze depend on how it is invoked. That said, this is minor. I totally accept that and prefer "string".freeze to "string"f.

2) without 3) really scares me. 

Imagine the odd semantics:

a = "hello"
a.freeze # freezes one RVALUE in memory and returns a different RVALUE
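
To make the concern concrete, here is a rough sketch of 2) without 3). The SketchPool module and its pooled_freeze method are hypothetical stand-ins I made up for illustration; they are not MRI's fstring table or any real API.

```ruby
# Hypothetical stand-in for a global string pool (an assumption, not MRI code).
module SketchPool
  TABLE = {}

  # Models option 2: freeze the receiver, but return whatever
  # string with the same content is already in the pool.
  def self.pooled_freeze(str)
    str.freeze
    TABLE[str] ||= str
  end
end

a = "hello".dup
b = "hello".dup

ra = SketchPool.pooled_freeze(a) # pool is empty, so a itself gets pooled
rb = SketchPool.pooled_freeze(b) # b is frozen in place, but a is returned

puts ra.equal?(a) # true
puts rb.equal?(b) # false: a different object came back
puts rb.equal?(a) # true
puts b.frozen?    # true: b was still frozen as a side effect
```

The second call is the odd case: b ends up frozen, yet the caller is handed a different object.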


As to 3), I don't think it can be implemented in MRI. If an RVALUE is moved in memory, MRI would have to crawl the heap and rewrite every RVALUE that holds a ref to it, because it does not keep track of these refs internally.

@charliesome thoughts? 
----------------------------------------
Feature #8992: Use String#freeze and compiler tricks to replace "str"f suffix
https://bugs.ruby-lang.org/issues/8992#change-42440

Author: headius (Charles Nutter)
Status: Open
Priority: Normal
Assignee: matz (Yukihiro Matsumoto)
Category: core
Target version: current: 2.1.0


BACKGROUND:

In https://bugs.ruby-lang.org/issues/8579 @charliesome introduced the "f" suffix for creating already-frozen strings. A string like "str"f would have the following characteristics:

* It would be frozen before the expression returned
* It would be the same object everywhere, pulling from a global "fstring" table

To avoid memory leaks, these pooled strings would remove themselves from the "fstring" table on GC.

However, there are problems with this new syntax:

* It will never parse in Ruby 2.0 and earlier.
* It's not particularly attractive, though this is a subjective matter.
* It does not lend itself well to use in other scenarios, such as for arrays and hashes (http://bugs.ruby-lang.org/issues/8909 )

PROPOSAL:

I propose that we eliminate the new "f" suffix and just make the compiler smart enough to treat literal strings followed by .freeze the same way.

So this code:

str = "mystring".freeze

Would be equivalent in the compiler to this code:

str = "mystring"f

And the fstring table would still be used to return pooled instances.
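
A small check of what this would look like from Ruby code. The method name pooled_literal is made up for the example. The frozen? result holds on any Ruby version; the identity result depends on whether the running implementation pools the literal, so treat that line as something to observe rather than a guarantee.

```ruby
# Hypothetical method name; the body is the pattern from the proposal.
def pooled_literal
  "mystring".freeze
end

first  = pooled_literal
second = pooled_literal

puts first.frozen?        # frozen regardless of whether pooling happens
puts first == second      # same content either way
puts first.equal?(second) # same object only if the compiler pools the literal
```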

IMPLEMENTATION NOTES:

The fstring table already exists on master and would be used for these pooled strings. An open question is whether the compiler should always optimize "str".freeze to return the pooled version or whether it should check (inline-cache style) whether String#freeze has been replaced. I am ok with either, but the best potential comes from ignoring String#freeze redefinitions...or making it impossible to redefine String#freeze.

BONUS BIKESHEDDING:

If we do not want to overload the existing .freeze method in this way, we could follow suggestions in http://bugs.ruby-lang.org/issues/8977 to add a new "frozen" method (or some other name) that the compiler would understand.

If it were "frozen", the following two lines would be equivalent:

str = "mystring".frozen
str = "mystring"f

In addition, using .frozen on any string would put it in the fstring table and return that pooled version.

I also propose one alternative method name: the unary ~ operator.

There is no ~ on String right now, and it has no meaning for strings that we'd be overriding. So the following two lines would be equivalent:

str = ~"mystring"
str = "mystring"f
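
For comparison, the runtime behaviour of both spellings can be approximated in plain Ruby. This is only a sketch: FrozenSketch is a hypothetical pool that never releases entries on GC, and String#frozen and String#~ are defined here purely for illustration (neither exists on String today). It does not show the compiler side of the proposal.

```ruby
# Hypothetical pool standing in for the fstring table (assumption:
# entries are never removed, unlike the real table).
module FrozenSketch
  TABLE = {}

  def self.fetch(str)
    TABLE[str] ||= str.dup.freeze
  end
end

class String
  # Hypothetical "frozen": return the pooled, frozen version of this string.
  def frozen
    FrozenSketch.fetch(self)
  end

  # Hypothetical unary ~ with the same meaning.
  def ~
    frozen
  end
end

x = "mystring".dup.frozen
y = ~"mystring"

puts x.frozen?   # true
puts x.equal?(y) # true: both spellings return the same pooled object
```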

JUSTIFICATION:

Making the compiler aware of normal method-based String freezing has the following advantages:

* It will parse in all versions of Ruby.
* It will behave the same in all versions of Ruby, apart from the fstring pooling.
* It extends neatly to Array and Hash; the compiler can see Array or Hash with literal elements and return the same object.
* It does not require a pragma (http://bugs.ruby-lang.org/issues/8976 )
* It looks like Ruby.


-- 
http://bugs.ruby-lang.org/