I checked the latest 1.9 out of svn last week to run this test.

[pbrannan@zem tmp]$ uname -a
Linux zem 2.6.5-7.252-smp #1 SMP Tue Feb 14 11:11:04 UTC 2006 i686 i686 i386 GNU/Linux
[pbrannan@zem tmp]$ ruby -v
ruby 1.8.2 (2004-12-25) [i686-linux]
[pbrannan@zem tmp]$ ruby1.9 -v
ruby 1.9.0 (2007-11-01 patchlevel 0) [i686-linux]

[pbrannan@zem tmp]$ cat run.sh
#!/bin/sh

n=$1

echo -n "$n "

ruby generate.rb $n | ruby test.rb

# Silly that I have to grep out the version number when I run ruby...
ruby1.9 generate.rb $n | ruby1.9 test.rb | grep -v patchlevel

[pbrannan@zem tmp]$ for i in 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 20000 30000 40000 50000 60000 70000 80000 90000 100000 200000 300000 400000 500000; do ./run.sh $i; done
1000 0.06188 0.050094 
2000 0.122787 0.102367 
3000 0.183689 0.154389 
4000 0.249985 0.205219 
5000 0.30916 0.258219 
6000 0.374419 0.313267 
7000 0.430542 0.368606 
8000 0.500378 0.416648 
9000 0.563266 0.472854 
10000 0.615718 0.530482 
20000 1.238703 1.112686 
30000 1.887975 1.723124 
40000 2.502101 2.387833 
50000 3.127626 3.185816 
60000 3.762366 3.864363 
70000 4.347518 4.714041 
80000 5.070685 5.571421 
90000 5.631676 6.398777 
100000 6.289828 7.407697 
200000 12.559129 19.146406 
300000 18.802112 35.404611 
400000 25.423781 55.181376 
500000 31.463272 80.206863 

[pbrannan@zem tmp]$ cat generate.rb
n = ARGV[0].to_i

n.times do
  puts "0=1,1=2,2=3,3=4,4=5,5=6,6=7,7=9,8=9,9=10,10=0"
end

[pbrannan@zem tmp]$ cat test.rb
def foo(line)
  h = Hash.new
  if(line.nil?)
    return h;
  end
  line.chomp!
  line.each_line(',') do |field|
    field.chomp!(',')
    a = field.split('=', 2)
    h.store(a[0].to_i, a[1])
  end
  return h
end

t = Time.now
while line = $stdin.gets do
  h = foo(line)
end
print "#{(Time.now - t).to_f} "


Up to 40000 lines 1.9 is a little faster than 1.8 in this test, but from
50000 lines on it is slower, and the gap keeps widening after that.
Koichi suggested it might be related to M17N, so I tried running 1.9
with -Ka and got the same results.  I checked the build options to make
sure optimizations weren't turned off, and 1.9 was in fact built with
-O2.
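
To put a number on the slowdown, the ratio of the 1.9 time to the 1.8
time can be computed from a few rows of the output above.  This is only
a small sketch; the names TIMINGS and slowdown are made up for the
example, and the figures are copied by hand from the table.

```ruby
# Rows copied from the run above: [lines, seconds on 1.8, seconds on 1.9].
TIMINGS = [
  [10000,   0.615718,  0.530482],
  [40000,   2.502101,  2.387833],
  [50000,   3.127626,  3.185816],
  [100000,  6.289828,  7.407697],
  [200000, 12.559129, 19.146406],
  [500000, 31.463272, 80.206863]
]

# How many times longer 1.9 took than 1.8 (hypothetical helper).
def slowdown(t18, t19)
  t19 / t18
end

TIMINGS.each do |n, t18, t19|
  printf("%7d %.2f\n", n, slowdown(t18, t19))
end
```

A value below 1.0 means 1.9 was faster; the value crosses 1.0 between
the 40000 and 50000 rows.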

If this were caused by M17N, I would expect the ratio of the execution
time in 1.9 to that in 1.8 to remain constant as the number of
iterations is increased.  Instead the ratio keeps growing: about 1.2 at
100000 lines, 1.5 at 200000 and 2.5 at 500000, so the 1.9 run time is
increasing faster than linearly.  I wonder if 1.9 is perhaps creating
more objects, or has a non-linear increase in overhead as the number of
objects is increased?
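
One way to probe the object/GC idea is to time the same parsing loop
with the garbage collector switched on and then off.  The sketch below
is untested against this 1.9 build and makes two assumptions: that the
parsing rather than $stdin.gets is what slows down (the lines are built
in memory here instead of read from the pipe), and that the machine has
enough memory to run a pass with GC disabled.  time_parse is a made-up
name; foo is the same routine as in test.rb.

```ruby
LINE = "0=1,1=2,2=3,3=4,4=5,5=6,6=7,7=9,8=9,9=10,10=0\n"

# Same routine as foo in test.rb.
def foo(line)
  h = Hash.new
  if(line.nil?)
    return h;
  end
  line.chomp!
  line.each_line(',') do |field|
    field.chomp!(',')
    a = field.split('=', 2)
    h.store(a[0].to_i, a[1])
  end
  return h
end

# Time n calls to foo, with or without the garbage collector running.
# (Hypothetical helper, not part of the original test.)
def time_parse(n, gc_enabled)
  GC.start                       # start both passes from a clean heap
  GC.disable unless gc_enabled
  t = Time.now
  n.times { foo(LINE.dup) }      # dup because foo modifies its argument
  Time.now - t
ensure
  GC.enable                      # always leave the collector switched on
end

n = ARGV[0].to_i
n = 10000 if n <= 0              # default when no count is given
printf("%d gc-on %.6f gc-off %.6f\n",
       n, time_parse(n, true), time_parse(n, false))
```

If the gc-off time stays roughly linear in n on 1.9 while the gc-on time
does not, that would point at the collector rather than at M17N.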

What other tests might I run to track this down?
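
One test I can think of is to time the loop in fixed-size chunks, to
see whether later lines cost more than earlier ones within a single
run.  A minimal sketch follows; chunk_times is a made-up helper, and the
StringIO input only stands in for the pipe so the example runs on its
own.  In test.rb the call would be chunk_times($stdin, 10000) with
foo(line) in the block.

```ruby
require 'stringio'

# Read lines from io, hand each one to the block, and return the elapsed
# seconds for every complete chunk of chunk_size lines.  A trailing
# partial chunk is not reported.
def chunk_times(io, chunk_size)
  times = []
  count = 0
  t = Time.now
  while line = io.gets
    yield line
    count += 1
    if count % chunk_size == 0
      now = Time.now
      times << (now - t)
      t = now
    end
  end
  times
end

# Stand-in input: 5000 short lines held in memory.
input = StringIO.new("0=1,1=2,2=3\n" * 5000)
times = chunk_times(input, 1000) { |line| line.split(',') }
times.each_with_index do |secs, i|
  printf("%d %.6f\n", (i + 1) * 1000, secs)
end
```

If the per-chunk times stay flat on 1.8 but climb on 1.9, the cost per
line depends on how long the process has been running, not on the line
itself.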

Paul