Issue #8977 has been updated by phluid61 (Matthew Kerwin).


headius (Charles Nutter) wrote:
> Actually, I'm getting pretty down on having the fstring cache at all. It seems like if we want a string pool, it should be via a library. Adding something into Ruby that pools strings for you just seems like asking for trouble, either due to GC overhead (cleaning up that hash for tons of transient frozen strings) and semantics (abuse of #frozen or #freeze to do pooling implicitly).

Is it any worse than String#intern returning a Symbol? IIRC this whole effort started because people were using Symbols as interned Strings (in the Java sense), but Symbols can't be GC'ed, so they ran into memory-leak-style problems. If we view the fstring cache as a way to get GC-able Symbols (in effect, though not in name), then these issues and complexities come with the territory.

I agree that we should make #freeze use the pool. If people really, really want a version that freezes and returns the receiver itself, we could introduce String#freeze!:

- rb_define_method(rb_cString, "freeze", rb_obj_freeze, 0);
+ rb_define_method(rb_cString, "freeze", rb_fstring, 0);
+ rb_define_method(rb_cString, "freeze!", rb_obj_freeze, 0);

This is based on my (possibly flawed) understanding that Ruby seems willing to make backwards-incompatible changes between minor versions (1.8 -> 1.9), even if not between majors (1.9.3 -> 2.0). The benefits of a pooled #freeze seem to outweigh the risk of someone depending on it returning the same object, especially if that person has an upgrade path (#freeze!) to get the old behaviour back.
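
To make the intended semantics concrete, here is a rough Ruby-level model of the two methods. It is only a sketch: the PooledFreeze module and its POOL hash are hypothetical stand-ins for the VM's internal fstring table, and a real table would hold its entries weakly so they can still be collected.

```ruby
# Hypothetical Ruby-level model of the proposal; not the actual C implementation.
module PooledFreeze
  # Stand-in for the VM's fstring table (content => canonical frozen String).
  # This plain Hash holds strong references, unlike a real pool.
  POOL = {}

  # Models the proposed pooled String#freeze: returns the canonical frozen
  # copy for this content, which may be a different object than the receiver.
  def self.freeze(str)
    POOL[str] ||= str.dup.freeze
  end

  # Models the proposed String#freeze!: freezes and returns the receiver itself.
  def self.freeze!(str)
    str.freeze
  end
end

a = "hello,world".split(",")[0]
b = "hel" + "lo"

pooled_a = PooledFreeze.freeze(a)
pooled_b = PooledFreeze.freeze(b)

puts pooled_a.equal?(pooled_b)            # equal content shares one pooled object
puts pooled_a.equal?(a)                   # the receiver itself is not what comes back
puts PooledFreeze.freeze!(a).equal?(a)    # freeze! keeps the old "same object" behaviour
```

The second check is the compatibility risk mentioned above: code that relies on #freeze returning the receiver would see a different object under the pooled version.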
----------------------------------------
Feature #8977: String#frozen that takes advantage of the deduping 
https://bugs.ruby-lang.org/issues/8977#change-43530

Author: sam.saffron (Sam Saffron)
Status: Assigned
Priority: Normal
Assignee: matz (Yukihiro Matsumoto)
Category: 
Target version: current: 2.1.0


During memory profiling I noticed that a large amount of string duplication comes from strings that cannot be determined ahead of time.

Take this report, for example: https://gist.github.com/SamSaffron/6789005 (generated using the memory_profiler gem, which works against head)

">=" x 4953
    /Users/sam/.rbenv/versions/2.1.0-dev/lib/ruby/2.1.0/rubygems/requirement.rb:93 x 4535

This string is most likely extracted from a version. 

Or 

"/Users/sam/.rbenv/versions/2.1.0-dev/lib/ruby/gems" x 5808
    /Users/sam/.rbenv/versions/2.1.0-dev/lib/ruby/gems/2.1.0/gems/activesupport-3.2.12/lib/active_support/dependencies.rb:251 x 3894

A string that cannot be pre-determined.

---- 


It would be nice to have 

"hello,world".split(",")[0].frozen.object_id == "hello"f.object_id 

Adding #frozen will give library builders a way of using the de-duping. It could also be implemented using weak refs in 2.0 and stubbed with a .dup.freeze in 1.9.3.
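
As a rough illustration of the weak-ref variant, here is a sketch written as a module function rather than a real String#frozen. The StringDedup name and the choice to key the pool by String#hash are assumptions made for this example only; keying by the integer hash avoids holding a strong reference to the pooled string, at the cost of giving up on pooling when two different strings collide.

```ruby
require 'weakref'

# Hypothetical library-level pool; not part of Ruby itself.
module StringDedup
  @pool = {}  # String#hash (Integer) => WeakRef to the canonical frozen String

  def self.frozen(str)
    key = str.hash
    ref = @pool[key]
    if ref
      begin
        existing = ref.__getobj__          # raises WeakRef::RefError once collected
        return existing if existing == str
        return str.dup.freeze              # hash collision: skip pooling for this string
      rescue WeakRef::RefError
        # The pooled string was garbage collected; fall through and re-register.
      end
    end
    canonical = str.dup.freeze
    @pool[key] = WeakRef.new(canonical)
    canonical
  end

  # The 1.9.3 stub mentioned above would simply be:
  #   def self.frozen(str); str.dup.freeze; end
end

a = StringDedup.frozen("hello,world".split(",")[0])
b = StringDedup.frozen("hel" + "lo")

puts a.equal?(b)   # true while `a` keeps the pooled string alive
puts a.frozen?
```

This sketch never prunes dead entries from the hash, so a real implementation would also need some cleanup step.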

Thoughts?





-- 
http://bugs.ruby-lang.org/