> Maybe, but then making a fast parser wouldn't be any fun :)
Since I ran my first preliminary benchmark I have been asking myself
how big the advantage of a C-based parser would actually be, so I
looked into this question a bit further. To also answer the question
of how your solutions "scale", I cleaned up my benchmarks a little
bit. The following includes all submissions that I could get to run
with ruby19 (the others wouldn't, for whatever reason). I don't have
json for ruby18 installed, which is why I didn't run this test with
ruby18.
The objects are generated before the test. The tests are run in a
tight loop, so the influence of the benchmarking code should be
rather marginal.
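
As a rough illustration of that setup, the sketch below prepares its
input first and times only the parse loop. It uses the bundled json
library and a made-up sample document; measure_throughput and samples
are hypothetical names and not part of the benchmark further down.

```ruby
require 'benchmark'
require 'json'

# Hypothetical helper: parse every prepared document once and return
# the throughput in characters per second.
def measure_throughput(documents)
  total_chars = documents.inject(0) { |sum, doc| sum + doc.size }
  elapsed = Benchmark.realtime do
    documents.each { |doc| JSON.parse(doc) }
  end
  # Guard against a zero reading on a very coarse clock.
  elapsed = Float::EPSILON if elapsed <= 0
  total_chars / elapsed
end

# Made-up input, generated before the clock starts.
samples = Array.new(1000) { |i| { 'id' => i, 'tags' => %w[a b c] }.to_json }

puts "%d chars/sec" % measure_throughput(samples)
```

Only the parse calls sit inside the timed block, so building the
input does not count towards the result.
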
For each of 4 different chunk sizes, ranging from about 45 to 900
bytes on average, I generated objects whose JSON representations add
up to about 2MB. The object set is identical for all solutions, so
the numbers should be quite comparable. Since the figures differ
slightly from Eric Mahurin's benchmark, it's possible that I did
something wrong. But in that case I did it equally wrong for all
solutions. The code is down below.
Regards,
Thomas.
Input chunks:
10: n=43475 avg.size=46.01 tot.size=2000236
20: n=12856 avg.size=155.61 tot.size=2000543
30: n=4897 avg.size=408.51 tot.size=2000483
40: n=2236 avg.size=894.47 tot.size=2000045
Ruby19 json
         user     system      total        real
10   2.274000   0.000000   2.274000 (  2.294000)
20   1.402000   0.000000   1.402000 (  1.432000)
30   1.041000   0.000000   1.041000 (  1.061000)
40   1.282000   0.000000   1.282000 (  1.302000)
10 871942 chars/sec (2000236/2.29)
20 1397027 chars/sec (2000543/1.43)
30 1885469 chars/sec (2000483/1.06)
40 1536132 chars/sec (2000045/1.30)
"solution_tml.rb"
         user     system      total        real
10   8.452000   0.010000   8.462000 (  8.633000)
20   6.570000   0.000000   6.570000 (  6.599000)
30   6.068000   0.000000   6.068000 (  6.119000)
40   5.659000   0.000000   5.659000 (  5.698000)
10 231696 chars/sec (2000236/8.63)
20 303158 chars/sec (2000543/6.60)
30 326929 chars/sec (2000483/6.12)
40 351008 chars/sec (2000045/5.70)
"solution_tml_pb.rb" (modified by P Bonzini)
         user     system      total        real
10   8.151000   0.000000   8.151000 (  8.192000)
20   5.849000   0.000000   5.849000 (  5.879000)
30   5.307000   0.000000   5.307000 (  5.337000)
40   5.238000   0.000000   5.238000 (  5.268000)
10 244169 chars/sec (2000236/8.19)
20 340286 chars/sec (2000543/5.88)
30 374832 chars/sec (2000483/5.34)
40 379659 chars/sec (2000045/5.27)
"solution_eric_i.rb"
         user     system      total        real
10 158.318000   0.040000 158.358000 (158.798000)
20 162.133000   0.030000 162.163000 (162.845000)
30 170.305000   0.030000 170.335000 (170.525000)
40 193.187000   0.070000 193.257000 (193.458000)
10 12596 chars/sec (2000236/158.80)
20 12284 chars/sec (2000543/162.85)
30 11731 chars/sec (2000483/170.53)
40 10338 chars/sec (2000045/193.46)
"solution_eric_mahurin3.rb"
         user     system      total        real
10   7.631000   0.000000   7.631000 (  7.641000)
20   6.319000   0.000000   6.319000 (  6.329000)
30   6.179000   0.000000   6.179000 (  6.179000)
40   5.769000   0.000000   5.769000 (  5.778000)
10 261776 chars/sec (2000236/7.64)
20 316091 chars/sec (2000543/6.33)
30 323755 chars/sec (2000483/6.18)
40 346148 chars/sec (2000045/5.78)
"solution_james_gray.rb"
         user     system      total        real
10  13.820000   0.000000  13.820000 ( 13.890000)
20  12.117000   0.000000  12.117000 ( 12.138000)
30  12.909000   0.000000  12.909000 ( 12.918000)
40  15.051000   0.010000  15.061000 ( 15.082000)
10 144005 chars/sec (2000236/13.89)
20 164816 chars/sec (2000543/12.14)
30 154860 chars/sec (2000483/12.92)
40 132611 chars/sec (2000045/15.08)
"solution_justin_ethier.rb"
         user     system      total        real
10  17.025000   0.000000  17.025000 ( 17.025000)
20  17.915000   0.040000  17.955000 ( 17.985000)
30  28.001000   0.021000  28.022000 ( 28.041000)
40  51.253000   0.070000  51.323000 ( 51.394000)
10 117488 chars/sec (2000236/17.03)
20 111233 chars/sec (2000543/17.98)
30 71341 chars/sec (2000483/28.04)
40 38915 chars/sec (2000045/51.39)
"solution_paolo_bonzini.rb"
         user     system      total        real
10  11.036000   0.000000  11.036000 ( 11.036000)
20  17.045000   0.030000  17.075000 ( 17.104000)
30  32.717000   0.020000  32.737000 ( 32.857000)
40  69.119000   0.070000  69.189000 ( 69.310000)
10 181246 chars/sec (2000236/11.04)
20 116963 chars/sec (2000543/17.10)
30 60884 chars/sec (2000483/32.86)
40 28856 chars/sec (2000045/69.31)
"solution_steve.rb"
         user     system      total        real
10 210.152000   0.040000 210.192000 (210.573000)
20 215.260000   0.060000 215.320000 (215.590000)
30 223.201000   0.110000 223.311000 (228.368000)
40 241.257000   0.260000 241.517000 (248.868000)
10 9499 chars/sec (2000236/210.57)
20 9279 chars/sec (2000543/215.59)
30 8759 chars/sec (2000483/228.37)
40 8036 chars/sec (2000045/248.87)
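
To put a number on the C-parser question, one could divide each
solution's real time by the json time for the same chunk label. A
minimal sketch, assuming the figures are typed in by hand from the
tables above; REAL_SECONDS and slowdown_vs are names made up for this
illustration, and only three of the pure-Ruby solutions are listed.

```ruby
# "real" seconds copied from the tables above, keyed by chunk label.
REAL_SECONDS = {
  'json'                   => { 10 => 2.294,  20 => 1.432,  30 => 1.061,  40 => 1.302 },
  'solution_tml_pb'        => { 10 => 8.192,  20 => 5.879,  30 => 5.337,  40 => 5.268 },
  'solution_eric_mahurin3' => { 10 => 7.641,  20 => 6.329,  30 => 6.179,  40 => 5.778 },
  'solution_james_gray'    => { 10 => 13.890, 20 => 12.138, 30 => 12.918, 40 => 15.082 }
}

# Hypothetical helper: for every entry except the baseline, divide its
# time by the baseline's time for the same chunk label.
def slowdown_vs(baseline_name, table)
  baseline = table.fetch(baseline_name)
  result = {}
  table.each do |name, times|
    next if name == baseline_name
    result[name] = times.map { |label, secs| [label, secs / baseline.fetch(label)] }
  end
  result
end

slowdown_vs('json', REAL_SECONDS).each do |name, ratios|
  line = ratios.map { |label, r| "%d: %.1fx" % [label, r] }.join('  ')
  puts "%-24s %s" % [name, line]
end
```

With these figures, solution_tml_pb and solution_eric_mahurin3 come
out at roughly 3 to 6 times the json time, depending on chunk size.
The remaining solutions can be added to the table the same way.
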
Benchmark code:
require 'benchmark'
# require 'json/pure'
require 'json'
N = 2000
S = [10, 20, 30, 40]
# This is a slightly enhanced version of Ara's object generator.
# Objects are generated via RandomObject.generate(nil, DEPTH)
# -- the first argument defines which object types are eligible
# and can be ignored in this context.
require 'tml/random-object'
puts 'Preparing objects ...'
sizes = Hash.new
objects = S.inject({}) do |h, s|
  size = 0
  a = h[s] = []
  n = N * 1000
  while size < n
    o = RandomObject.generate(nil, s)
    j = o.to_json
    a << [o, j]
    size += j.size
  end
  sizes[s] = size.to_f
  h
end
# Each file named on the command line must define JSONParser#parse.
ARGV.each do |arg|
  p arg
  require arg
  parser = JSONParser.new
  throughput = []
  Benchmark.bm do |b|
    S.each do |s|
      t = b.report(s.to_s) do
        objects[s].each do |o, j|
          # Check the result so that a wrong parse cannot pass as a fast one.
          if o != parser.parse(j)
            raise RuntimeError
          end
        end
      end
      throughput << "%s %d chars/sec (%d/%0.2f)" % [s,
        sizes[s] / t.real, sizes[s], t.real]
    end
  end
  puts
  puts throughput.join("\n")
  puts
  puts
end
objects.each do |s, z|
  puts "%s: n=%d avg.size=%0.2f tot.size=%d" %
    [s, z.size, sizes[s].to_f / z.size, sizes[s]]
end
puts
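
The loop above expects every file given on the command line to define
a JSONParser class with a parse method. The post does not show how the
json library itself was hooked in for the "Ruby19 json" run, so the
following is only a guess at what such a wrapper could look like; the
sample object is made up as well.

```ruby
require 'json'

# Hypothetical adapter (not shown in the post): gives the json library
# the JSONParser#parse interface that the benchmark loop calls.
class JSONParser
  def parse(text)
    JSON.parse(text)
  end
end

# Round-trip check in the style of the benchmark loop.
object = { 'name' => 'quiz', 'values' => [1, 2.5, true, nil] }
parser = JSONParser.new
raise 'round trip failed' if object != parser.parse(object.to_json)
puts 'round trip ok'
```

Saved under a file name of your choosing, a wrapper like this could
be passed to the benchmark in the same way as the solution files.
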