Since I've had a few private e-mails sent to me on this I thought I would
just post some extra information.
----
In case it was not clear, the timing numbers I gave were for a complete
run of the script, which included
(startup+compilation+execution+shutdown+exit) of the VM.

That run did not use a "hint" to suggest the size of the hash.
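
To illustrate what a hint saves (the growth policy itself is described in
my earlier message, quoted below), here is a rough sketch in Ruby. It is
not SmallScript's actual <Dictionary> code: the helper names prime?,
next_prime and growth_steps, and the starting capacity of 16, are
assumptions made up for this example. It lists the capacities a table
would pass through when doubling and rounding up to a prime, compared with
the single allocation a hint allows.

```ruby
# Hypothetical helpers for illustration only; not part of Ruby's Hash
# or of SmallScript's Dictionary.

# Trial-division primality test; adequate for table sizes of this magnitude.
def prime?(n)
  return false if n < 2
  (2..Integer.sqrt(n)).none? { |d| (n % d).zero? }
end

# Smallest prime greater than or equal to n.
def next_prime(n)
  n += 1 until prime?(n)
  n
end

# Capacities visited when growing from `start` until `needed` elements fit,
# doubling each time and rounding each power of two up to a prime.
def growth_steps(start, needed)
  steps = []
  power = start
  loop do
    cap = next_prime(power)
    steps << cap
    break if cap >= needed
    power *= 2
  end
  steps
end

steps = growth_steps(16, 1_000_001)
puts "resizes without a hint: #{steps.length - 1}"
puts "capacity with a hint:   #{next_prime(1_000_001)}"
```

Each resize without a hint also means rehashing every element stored so
far, which is the work the hint avoids.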

The entire Hash (aka <Dictionary>) class is written in SmallScript (there
are no VM/C++ primitives).

The timing for executing just the snippet (not counting
startup/compilation/shutdown), with a hint provided, is:

"Run using standard #to:do: message (which gets inlined)"

    0.570s to run snippet with hint

    stdout cr <<
    [
        |data. a := Dictionary(1000001).|
        0 to: 1000000 do: [:i| a[i] := 'joe'].
        0 to: 1000000 do: [:i| data := a[i]].
    ] millisecondsToRun.


"OR -- using alternate syntax with an <Interval> (not inlined)"

    0.656s to run snippet with hint

    stdout cr <<
    [
        |data. a := Dictionary(1000001).|
        for (each i in 0 to: 1000000)
            a[i] := 'joe'.
        for (each i in 0 to: 1000000)
            data := a[i].
    ] millisecondsToRun.

The cpu is a 1.2GHz AMD TBird.
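
For anyone who wants to separate the build and query times on the Ruby
side in the same way, a minimal sketch using Benchmark.realtime from
Ruby's standard library follows. The method name time_hash_snippet and the
reduced element count are my own choices for illustration, and no
particular timings are implied.

```ruby
require 'benchmark'

# Times only the two loops, excluding interpreter startup and shutdown.
# time_hash_snippet is a hypothetical helper name, not from the thread.
def time_hash_snippet(n)
  a = {}
  data = nil
  build = Benchmark.realtime { 0.upto(n) { |i| a[i] = "joe" } }
  query = Benchmark.realtime { 0.upto(n) { |i| data = a[i] } }
  { build: build, query: query, size: a.size, last: data }
end

# n is kept small so the sketch runs quickly; the thread uses 1_000_000.
result = time_hash_snippet(100_000)
printf("build: %.3fs  query: %.3fs\n", result[:build], result[:query])
```

Running the whole script under /usr/bin/time, as in the original post,
measures startup and shutdown as well, so the two kinds of numbers are not
directly comparable.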

--
-- Dave S. [ http://www.smallscript.net ]

"David Simmons" <pulsar / qks.com> wrote in message
news:wTD17.159652$%i7.106431402 / news1.rdc1.sfba.home.com...
> "Joseph McDonald" <joe / vpop.net> wrote in message
> news:eTp17.43890$AM.956429 / e420r-sjo3.usenetserver.com...
> >
> > Hi,
> >
> > Just wondering if there is a trick to speed up the following:
> >
> > % cat junk.rb
> > a = Hash.new
> > for i in 0..1000000
> >   a[i] = "joe"
> > end
> >
> > for i in 0..1000000
> >   data = a[i]
> > end
> >
> > /usr/bin/time ./junk.rb
> > 34.55user 0.19system 0:34.73elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
> > 0inputs+0outputs (232major+11360minor)pagefaults 0swaps
> >
> >
> > If I do a GC.disable it gets much better:
> >
> > /usr/bin/time ./junk.rb
> > 5.02user 0.13system 0:05.14elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
> > 0inputs+0outputs (228major+12371minor)pagefaults 0swaps
> >
> > I have tried the above with both 1.6.4 and 1.7.1 with generational
> > GC patch, results are about the same.
> >
> > Should we just make sure that we turn off GC when doing something like
> > above?
> >
> > python is about 4 seconds for the same thing and perl is 5.1 seconds.
>
> Hmm. I'm not familiar with Ruby's Hash class implementation.
>
> However, in a language like Smalltalk one can provide a "hint" as to the
> number of elements the hash (aka <Dictionary>) object will contain. This
> allows it to pre-allocate enough space, so it does not have to keep
> getting grown and rehashed. Second, growth is typically performed in
> powers of two rounded up to the nearest prime number (to improve the
> likelihood of a good hashing distribution).
>
> A given hashed collection object (or a subclass) may be customized to
> change the default resize (add/remove) growth/shrink policy
> characteristics or algorithm.
>
> For comparison, when I ran the above snippet in SmallScript (QKS
> Smalltalk), it took:
>
>      0.556 seconds to build the hash.
>
>      0.869 seconds to both build the hash and then query it (which is
>                    what the provided snippet appears to do).
>
> SmallScript GC is fully enabled.
>
> I don't know what machine the original poster was using for the Ruby run.
>
> The machine used for this run is a 1.2GHz AMD TBird. Assuming that the
> original poster was running on a 500MHz machine, a comparable
> SmallScript run would take about 2 seconds.
>
> --
> -- Dave S. [ http://www.smallscript.net ]
>
> >
> > thanks,
> > -joe
> >
> >