Issue #8977 has been updated by tmm1 (Aman Gupta).


@ko1 and I discussed this at length earlier.

Although a frozen string table could be implemented in ruby (with the help of C-ext like WeakHash above), the current implementation of the finalizer table adds overhead that would make it unsuitable for long-lived strings. In particular, each finalizer currently requires 2 extra object VALUE slots and the finalizer_table is marked on every minor mark.

The main concern with exposing the fstr table to ruby is that it could be easily be misused. Feeding a large number of entries into this table would slow down lookup times and subsequent calls to rb_fstring(). Currently rb_fstring() calls are isolated to boot-up and compile time, so runtime performance is not a factor. Misuse from ruby-land could increase the number of frozen_strings hash lookups, possibly introducing performance or security concerns.

As as alternative, we discussed including a C-only api for 2.1. This would limit possible misuse/abuse, yet still allow for responsible use of de-duplication features present in 2.1. We should include the size of the frozen_strings table to encourage monitoring and size caps. This API might look something like the following (naming suggestions welcome):

  VALUE rb_str_frozen_dedup(VALUE str)
  size_t rb_str_frozen_table_size()

----------------------------------------
Feature #8977: String#frozen that takes advantage of the deduping 
https://bugs.ruby-lang.org/issues/8977#change-43545

Author: sam.saffron (Sam Saffron)
Status: Assigned
Priority: Normal
Assignee: matz (Yukihiro Matsumoto)
Category: 
Target version: current: 2.1.0


During memory profiling I noticed that a large amount of string duplication is generated from non pre-determined strings.

Take this report for example https://gist.github.com/SamSaffron/6789005 (generated using the memory_profiler gem that works against head) 

">=" x 4953
    /Users/sam/.rbenv/versions/2.1.0-dev/lib/ruby/2.1.0/rubygems/requirement.rb:93 x 4535

This string is most likely extracted from a version. 

Or 

"/Users/sam/.rbenv/versions/2.1.0-dev/lib/ruby/gems" x 5808
    /Users/sam/.rbenv/versions/2.1.0-dev/lib/ruby/gems/2.1.0/gems/activesupport-3.2.12/lib/active_support/dependencies.rb:251 x 3894

A string that can not be pre-determined. 

---- 


It would be nice to have 

"hello,world".split(",")[0].frozen.object_id == "hello"f.object_id 

Adding #frozen will give library builders a way of using the de-duping. It also could be implemented using weak refs in 2.0 and stubbed with a .dup.freeze in 1.9.3 . 

Thoughts ?  





-- 
http://bugs.ruby-lang.org/