Aleksi Niemelä <aleksi.niemela / cinnober.com> writes:

> I think the major problem is that I'm creating tons of objects, and most of
> the time is spent on garbage collection. Actually the real data does not get
> lost at any point; there are probably just too many intermediate objects
> before the final form.

I changed your code slightly, and you're correct: the time per iteration is
linear in the number of objects present in the system.

        Timestamp  Count   Time/1000   #objs  time/obj (µs)
     16:14:53.426   1000:  0.8981      34645 25.923943
     16:14:55.102   2000:  1.6759      69350 24.166229
     16:14:57.595   3000:  2.4932     102927 24.223352
     16:15:00.932   4000:  3.3370     137932 24.193392
     16:15:04.961   5000:  4.0282     171525 23.484781
     16:15:09.798   6000:  4.8378     206498 23.427699
     16:15:15.369   7000:  5.5704     240117 23.198791
     16:15:21.681   8000:  6.3117     274941 22.956460
     16:15:28.824   9000:  7.1438     308884 23.127763

These numbers look about right: each time around the loop you create 34
objects (15 key/value pairs and a hash for 'entry', and an array and
two strings for 'key').
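You can check the per-iteration count by reading the interpreter's allocation
counter around a block. This is a sketch using the modern `GC.stat` API, and
the loop body is only a stand-in for the original code:

```ruby
# Counts how many objects a block allocates. GC is disabled during the
# block so no objects are reclaimed while we are counting.
def allocations
  GC.disable
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
ensure
  GC.enable
end

# Stand-in for one pass of the loop: an array holding two fresh strings,
# which should account for at least 3 allocations (array + 2 strings).
n = allocations { ["first", "second"] }
puts n
```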

If you disable garbage collection, the time per 1000 iterations becomes
constant, so the total time is linear.

     16:18:32.193   1000:  0.4336
     16:18:32.626   2000:  0.4324
     16:18:33.070   3000:  0.4445
     16:18:33.506   4000:  0.4360
     16:18:33.942   5000:  0.4359
     16:18:34.377   6000:  0.4347
     16:18:34.812   7000:  0.4355
     16:18:35.256   8000:  0.4438
     16:18:35.691   9000:  0.4352
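For anyone who wants to reproduce the comparison, a minimal benchmark along
these lines shows the gap. The loop body is only a stand-in for the original
code, and the exact numbers will differ by machine and Ruby version:

```ruby
require 'benchmark'

# Allocation-heavy loop: each pass builds a 15-pair hash and a joined
# key, roughly mimicking the object churn described above.
def churn(iterations)
  iterations.times do
    entry = (1..15).to_h { |k| ["key#{k}", k] }
    key   = "#{entry['key1']}-#{entry['key2']}"
  end
end

with_gc = Benchmark.realtime { churn(20_000) }

GC.disable
without_gc = Benchmark.realtime { churn(20_000) }
GC.enable
GC.start   # reclaim everything in one pass afterwards

puts format("with GC: %.3fs, GC disabled: %.3fs", with_gc, without_gc)
```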


So it looks like the overhead of marking and sweeping all those
thousands of live objects on every collection is what's causing the problem.
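In modern Rubies the built-in GC profiler makes this cost directly visible.
A sketch, keeping the objects alive so every collection has to mark all of
them, which is the situation described above:

```ruby
GC::Profiler.enable

# Retain everything so the live set grows: each collection must then
# mark an ever-larger heap, and the per-collection time grows with it.
retained = []
50_000.times { retained << "payload-#{retained.size}" }
GC.start

# Each row of the report shows when the GC ran and how long it took.
puts GC::Profiler.result

GC::Profiler.disable
```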

Matz?



Dave