Issue #5976 has been reported by Masaki Matsushita.

----------------------------------------
Feature #5976: abolition of MD5 calculation in pstore.rb
https://bugs.ruby-lang.org/issues/5976

Author: Masaki Matsushita
Status: Open
Priority: Normal
Assignee: 
Category: lib
Target version: 


=begin
I propose removing the MD5 calculation from pstore.rb.
The current pstore.rb reads the whole database file to calculate its MD5 digest, and
then reads the whole file again for Marshal.load.

I think the MD5 calculation in pstore.rb is meant to avoid rewriting identical data and thereby improve performance.
However, I noticed that it does not actually achieve that.
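
As a rough illustration of the mechanism described above, here is a simplified sketch. It follows the description in this report rather than the actual pstore.rb source, and the helper names load_with_checksum and save_if_changed are hypothetical.

```ruby
require 'digest/md5'

# Hypothetical helper: one pass over the file for the digest,
# then a second pass to unmarshal it (as described in this report).
def load_with_checksum(path)
  checksum = Digest::MD5.file(path).digest
  table = File.open(path, 'rb') { |f| Marshal.load(f) }
  [table, checksum]
end

# Hypothetical helper: serialize the table and write it back only
# when its digest differs from the one taken at load time.
def save_if_changed(path, table, original_checksum)
  new_data = Marshal.dump(table)
  return false if Digest::MD5.digest(new_data) == original_checksum
  File.binwrite(path, new_data)
  true
end
```

In this sketch the digest is computed on every transaction, whether or not the data changed, while the write is skipped only when the data is identical.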

Therefore, I wrote a patch that removes the MD5 calculation and ran the following benchmark (from #5248#note-8, [ruby-core:41747]).

 require 'pstore'
 require 'benchmark'
 
 def run(size)
   file = "pstore_#{size}"
   File.unlink(file) if File.exist?(file)
   store = PStore.new(file)
   store.transaction do
     store["hoge"] = "hoge" * size
   end
   1000.times do
     store.transaction do
       store["hoge"] += "hoge"
     end
   end
 end
 
 Benchmark.bm(6) do |bm|
   [1000, 10000, 100000].each do |size|
     bm.report(size.to_s) { run(size) }
   end
 end

results:

present pstore.rb(r34083):

              user     system      total        real
 1000     0.140000   0.060000   0.200000 (  0.203312)
 10000    0.300000   0.060000   0.360000 (  0.355922)
 100000   1.710000   0.380000   2.090000 (  2.097768)

proposed pstore.rb:

              user     system      total        real
 1000     0.130000   0.030000   0.160000 (  0.157974)
 10000    0.170000   0.050000   0.220000 (  0.223475)
 100000   0.690000   0.220000   0.910000 (  0.911585)

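The proposed pstore.rb is faster: roughly 1.3x, 1.6x, and 2.3x in real time for sizes 1000, 10000, and 100000 respectively.
I also made a case that is more disadvantageous for the proposed pstore.rb.
In this case, the present pstore.rb doesn't write any data, because the data is unchanged and the MD5 check works as intended.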

 require 'pstore'
 require 'benchmark' 
 
 def run(size)
   file = "pstore_#{size}"
   File.unlink(file) if File.exist?(file)
   store = PStore.new(file)
   1000.times do
     store.transaction do
       store["hoge"] = "hoge" * 100
     end
   end
 end
 
 Benchmark.bm(6) do |bm|
   [1000, 10000, 100000].each do |size|
     bm.report(size.to_s) { run(size) }
   end
 end


results:

present pstore.rb(r34083):

              user     system      total        real
 1000     0.180000   0.030000   0.210000 (  0.219204)
 10000    0.120000   0.050000   0.170000 (  0.169018)
 100000   0.120000   0.040000   0.160000 (  0.159369)

proposed pstore.rb:

              user     system      total        real
 1000     0.130000   0.030000   0.160000 (  0.162533)
 10000    0.110000   0.020000   0.130000 (  0.126099)
 100000   0.110000   0.010000   0.120000 (  0.122751)

The proposed pstore.rb is no slower even in this disadvantageous case.
These benchmarks show that removing the MD5 calculation improves performance.
The proposed pstore.rb passes test_pstore.rb.
=end


-- 
http://bugs.ruby-lang.org/