ako... wrote:
> thanks, i did not know that one can count objects using ObjectSpace.
>
> it looks like map+join use fewer objects than inject. even though map
> creates a whole new array. i would expect inject to use fewer objects.
> strange...
>
> konstantin
>
> Erik Veenstra wrote:
> > ----------------------------------------------------------------
> >
> >  $ vi test.rb ; cat test.rb
> >  GC.disable
> >
> >  1000.times do
> >    (1..1000).map{|s| s.to_s}.join("")
> >    #(1..1000).inject(""){|r, s| s.to_s; r}
> >  end
> >
> >  count   = 0
> >  ObjectSpace.each_object do
> >    count += 1
> >  end
> >  p count
> >
> >  $ ruby test.rb
> >  1003392
> >
> > ----------------------------------------------------------------
> >
> >  $ vi test.rb ; cat test.rb
> >  GC.disable
> >
> >  1000.times do
> >    #(1..1000).map{|s| s.to_s}.join("")
> >    (1..1000).inject(""){|r, s| s.to_s; r}
> >  end
> >
> >  count   = 0
> >  ObjectSpace.each_object do
> >    count += 1
> >  end
> >  p count
> >
> >  $ ruby test.rb
> >  2001392
> >
> > ----------------------------------------------------------------
> >
> > gegroet,
> > Erik V. - http://www.erikveen.dds.nl/

It's true that map appears to create fewer objects, which may have some
advantages, but the situation isn't entirely one-sided.  Consider the
case where the result of each block call consumes a whole megabyte of
memory.  Using map and join, you get 1000 megabyte-long strings in an
array, and none of them can be freed until the join has finished.
You're already a gig into your memory before you join them, and the
joined string takes up another gig.

If, as in the example, you have garbage collection turned off, you'll
be 2 gigs (plus overhead) into your memory+swap in either scenario.
However, with the garbage collector enabled and using inject, the
result of each iteration could be freed as soon as it has been added
to the accumulated string, keeping your maximum memory demand lower.
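
To make the difference concrete, here is a minimal sketch of the two
forms.  The helper name piece is made up for illustration (it stands in
for whatever expensive call produces each string), and the inject
version assumes you append with << rather than just returning r as in
the benchmark quoted above.

```ruby
# Hypothetical stand-in for the expensive per-item function;
# the quoted benchmark simply uses s.to_s.
def piece(i)
  i.to_s
end

def join_with_map(range)
  # map builds an Array that keeps every piece referenced until
  # join has produced the final string.
  range.map { |i| piece(i) }.join("")
end

def join_with_inject(range)
  # Each piece is appended to the accumulator; after that nothing
  # references it, so the collector is free to reclaim it.
  range.inject("".dup) { |acc, i| acc << piece(i) }
end
```

Both return the same string; what differs is how many of the pieces
have to stay alive at the same time.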

Of course, the complete answer would have to factor in the
implementations of the string concatenations and of the join method,
the number of memory reallocations needed by each, and the degree of
heap fragmentation incurred.
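
As one example of how much the concatenation itself matters: String#+
returns a new string on every call, while String#<< appends to the
receiver.  A rough sketch (the method names are mine, not from the
thread):

```ruby
def concat_with_plus(pieces)
  # acc + s allocates a fresh String each iteration and copies
  # everything accumulated so far into it.
  pieces.inject("") { |acc, s| acc + s }
end

def concat_with_append(pieces)
  # acc << s modifies the accumulator in place and returns it,
  # so no new accumulator object is created per iteration.
  pieces.inject("".dup) { |acc, s| acc << s }
end
```

How often << has to grow its buffer underneath is an implementation
detail, so it would still need to be measured.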

Personally, I would use some kind of memory or file stream with
inject, so that the intermediate result objects can be freed and
memory reallocations can be kept to a minimum as well.
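
Something along these lines, as a sketch only.  The names piece and
join_with_stream are invented for the example, and I'm using StringIO
as the memory stream.  Whether it really reallocates less than a plain
String#<< is an assumption I haven't measured.

```ruby
require "stringio"

# Hypothetical stand-in for the expensive per-item function.
def piece(i)
  i.to_s
end

# Writes every piece to the given stream and returns the stream.
# Works with anything whose << returns the receiver (StringIO, File).
def join_with_stream(range, io = StringIO.new)
  range.inject(io) { |out, i| out << piece(i) }
end
```

With the default StringIO, join_with_stream(1..1000).string gives you
the joined result.  Passing an open File instead sends the pieces to
disk, so the whole result never has to sit in memory.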

In the end, Erik's approach, using metrics rather than speculation,
will be most effective.
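
For anyone who wants to repeat the measurement without walking
ObjectSpace, a possible helper is sketched below.  It assumes MRI and
a version where GC.stat accepts the :total_allocated_objects key; the
name allocated_objects is mine.

```ruby
# Returns the number of objects allocated while the block runs.
# Assumes MRI; GC.stat(:total_allocated_objects) is not portable.
def allocated_objects
  was_disabled = GC.disable   # true if GC was already disabled
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
ensure
  # Restore the collector only if this method turned it off.
  GC.enable unless was_disabled
end
```

For example, allocated_objects { (1..1000).map{|s| s.to_s}.join("") }
can be compared against the same call with the inject variant.  The
numbers will depend on the Ruby version, so don't expect them to match
the ones quoted above.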

Brent Rowland