Issue #16150 has been updated by headius (Charles Nutter).


> But granted that this now causes a somewhat weird situation where to_s might or might not return a frozen string. That being said, it's already kinda the case

Yeah, this is precisely the problem. It is made even worse by the large number of libraries that already set `frozen-string-literal`.

One thing we should all be able to agree on at this point: it is not safe to blindly mutate the result of `String#to_s`, since more and more cases will produce frozen strings. With the experimental changes accepted above, an additional set of `to_s` results will also be frozen. So, by extension, it is not now and will never be safe to blindly mutate the result of calling any `to_s`.

I'd argue it's actually NEVER safe to modify the result of calling `to_s`, because `String#to_s` returns the receiver itself, so for mutable source strings you'd be mutating the original contents! This is probably not desired behavior in the large majority of existing code. With that condition in mind, it would actually be much *safer* for us to start returning frozen strings from `String#to_s` as soon as possible.
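
A minimal illustration of the aliasing point above, using only core behavior: `String#to_s` hands back the receiver, so appending to the result also changes the source string. The variable names are just for illustration.

```ruby
original = +"hello"        # unary plus yields a mutable string
result   = original.to_s   # String#to_s returns the receiver, not a copy

result << " world"         # mutates the one shared object

same_object = result.equal?(original)
puts original              # shows the mutated contents of the source string
puts same_object
```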

I know this is a difficult pill to swallow. The original purpose of this issue was to consider adding a way to *request* frozen strings from to_s, since that would at least be an opt-in. Hence the suggestions of a to_z or similar new mechanism.

In the 2.7 timeframe, I'd like to see some combination of the following:

* A way to explicitly request a frozen string from any object, such as `to_z`, so code could start migrating toward frozen strings today.

This would be an extension of existing `"str".freeze` and `-"str"` logic for explicitly requesting a single frozen string literal, but would allow making this request for any `to_s` result.
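
A rough sketch of the contract such a method could have, assuming the placeholder name `to_z` from this thread (it is not a real Ruby method) and defining it on `Kernel` purely for illustration. It only shows the caller-visible behavior; a real implementation would have to live in core to avoid the intermediate `to_s` allocation.

```ruby
# Hypothetical: `to_z` is only the placeholder name used in this thread.
module Kernel
  def to_z
    # String#-@ returns a frozen (possibly deduplicated) string and
    # leaves a mutable receiver untouched.
    -to_s
  end
end

name        = :user_id.to_z
source      = +"mutable"
frozen_view = source.to_z   # frozen copy; `source` itself stays mutable

begin
  name << "!"               # mutating the requested frozen string fails
  mutation_error = nil
rescue FrozenError => e
  mutation_error = e
end
```

Callers that need a mutable string would still call `to_s` (and `dup` if they want to be safe), so this stays strictly opt-in.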

* A debug mode that would warn if code attempts to mutate the result of `String#to_s` (e.g. `--debug:frozen-string-to_s`), since based on the above conditions this is almost never advisable.

This will help us audit existing code and start "fixing" it right now without a hard break. I'd like to see the above warning all the time, but that would require tracking source file and line for all literal strings (as in the current `--debug:frozen-string-literal`).

* A pragma to explicitly set a file as having mutable-string-literal, as a local escape hatch for future default frozen-string-literal.
* A command-line flag to force mutable-string-literal when no pragma exists, as a global escape hatch for future default frozen-string-literal.

These are a simple way to guarantee a soft landing if (when?) we default to `frozen-string-literal` globally in 3.0. All cases that would break with default `frozen-string-literal` could at least be made to work with the flag.
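
For reference, a small sketch of what the per-file escape hatch looks like with the magic comment that exists today. Whether this exact spelling would remain the opt-out under a future frozen default is an assumption, not something decided in this issue.

```ruby
# frozen_string_literal: false
# The magic comment above exists today and must appear before any code in
# the file. Assumption: it would keep working as the per-file opt-out.
# Existing global switches:
#   ruby --enable=frozen-string-literal / ruby --disable=frozen-string-literal

buffer = "log: "      # a plain literal, mutable under the comment above
buffer << "started"   # in-place mutation of that literal
puts buffer
```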

----------------------------------------
Feature #16150: Add a way to request a frozen string from to_s
https://bugs.ruby-lang.org/issues/16150#change-81756

* Author: headius (Charles Nutter)
* Status: Assigned
* Priority: Normal
* Assignee: Eregon (Benoit Daloze)
* Target version: 
----------------------------------------
Much of the time when a user calls to_s, they are just looking for a simple string representation to display or to interpolate into another string. In my brief exploration, the result of to_s is rarely mutated directly.

It seems that we could save a lot of objects by providing a way to explicitly request a *frozen* string.

For purposes of discussion I will call this to_frozen_string, which is a terrible name.

This would reduce string allocations dramatically when applied to many common to_s calls:

* Symbol#to_frozen_string could always return the same cached String representation. This method is *heavily* used by almost all Ruby code that intermingles Symbols and Strings.
* nil, true, false, and any other singleton values in the system could similarly cache and return the same String object.
* The strings coming from core types could also be in the fstring cache and deduplicated as a result.
* User-provided to_s implementations could opt-in to caching and returning the same frozen String object when the author knows that the result will always be the same.
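
As a sketch of the last bullet, here is how a user-defined class might opt in. `to_frozen_string` is the placeholder name from this proposal and `Currency` is a made-up example class; neither exists in Ruby.

```ruby
# Hypothetical example class; `to_frozen_string` is the placeholder name
# from this proposal, not an existing Ruby method.
class Currency
  def initialize(code)
    @code = code
    # Build the representation once and keep a single frozen copy.
    @frozen_name = -"Currency(#{code})"
  end

  # Existing contract: every call returns a fresh, mutable string.
  def to_s
    @frozen_name.dup
  end

  # Opt-in: every call returns the same cached frozen string.
  def to_frozen_string
    @frozen_name
  end
end

usd = Currency.new("USD")
puts usd.to_frozen_string
puts usd.to_frozen_string.equal?(usd.to_frozen_string)
```

The author of the class is the one who knows the result never changes, which is why the caching has to be opt-in on the implementation side as well as the calling side.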

A few ideas for what to call this:

* `to_fstring` or `fstring` reflects the internal "fstring" cache but is perhaps not obvious for most users.
* `to_s(frozen: true)` is clean but there will be many cases when the kwargs hash doesn't get eliminated, making matters worse.
* `def to_s(frozen = false)` would be mostly free but may not be compatible with existing to_s params (like `Integer#to_s(radix)`).
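
To make the compatibility concern in the last bullet concrete: `Integer#to_s` already uses its first positional argument for the radix, so a boolean in that position cannot be taken as a "frozen" flag without changing the existing signature. A small sketch of current behavior:

```ruby
hex = 255.to_s(16)     # the first positional argument is already the radix
puts hex

begin
  255.to_s(true)       # a boolean in that slot is not a valid radix today
  radix_error = nil
rescue TypeError => e
  radix_error = e
end
```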

This idea was inspired by @schneems's talk at RubyConf Thailand, where he showed significant overhead in ActiveRecord from Symbol#to_s allocation.


