On Sat, 04 Aug 2001 08:01:33 GMT, Harry Ohlsen <harryo / zip.com.au> wrote:
>> Oh, okay. I've noticed an interesting thing about [] though -- it is
>> almost as slow as doing a " % 2" on the number!
>
>Fortunately, I'll only ever be playing with relatively small numbers, so 
>the speed isn't an issue for me.  However, it sounds like there's an 
>opportunity for a cleanup on the [] method for BigNum.
>
>Strangely, I just tried writing a tiny two line routine to get the parity 
>and it performs only slightly better than using Ruby's & and is still not 
>as fast as [], which I find bizarre, since it's a cut-down version of the 
>code for [] !!
>
>Oh well. Someone's bound to come back and tell you how to make things 
>faster.

I spent more time thinking about this last night, and I believe what
I am running into is memory allocation overhead. [] itself runs in
constant time regardless of the size of the integer; a 128-bit
integer runs [] in the same amount of time as a 2048-bit integer, so
[] itself is not doing anything stupidly inefficient. Given the two
obvious implementations of gcd (Stein's binary gcd algorithm
vs. Euclid's algorithm), 30,000 >> calls on 1024-bit numbers ought to
run faster than 6,000 % calls, but the 6,000 % calls actually run
close to 4 times faster. Clearly Ruby is being swamped by something
other than the cost of the operations themselves. My suspicion is
memory allocation overhead. Hmm...
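
For reference, here is a minimal sketch of the two variants I am
comparing (method names are mine, not from modexp.rb; assumes a, b > 0):

def gcd_euclid(a, b)             # the '%'-based variant
  a, b = b, a % b until b == 0
  a
end

def gcd_binary(a, b)             # Stein's shift-based variant
  shift = 0
  while a[0] == 0 && b[0] == 0   # both even: pull out a common factor of 2
    a >>= 1 ; b >>= 1 ; shift += 1
  end
  a >>= 1 while a[0] == 0        # make a odd
  until b == 0
    b >>= 1 while b[0] == 0      # make b odd
    a, b = b, a if a > b         # keep a <= b
    b -= a                       # difference of two odd numbers is even
  end
  a << shift
end

Note that every %, >>, and -= on a Bignum hands back a brand-new
object, which is exactly where an allocator would get hammered.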

[eric@localhost ruby]$ ruby -r profile -e 'require "modexp.rb"' \
     -e "a=128.random ; 30000.times { a.clone } "
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 53.23     2.14      2.14        2  1070.00  2010.00  Fixnum#times
 46.77     4.02      1.88    30001     0.06     0.06  Numeric#clone
....
Yep, it appears that creating and destroying Bignums is a very
expensive operation in Ruby.
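
As a sanity check (the profiler itself adds per-call overhead, so the
absolute numbers above are inflated), the same loop can be timed with
a plain wall-clock delta; the 1024-bit constant below is just a
stand-in for 128.random from modexp.rb:

a = (1 << 1023) | 1              # deterministic 1024-bit stand-in
t0 = Time.now
30000.times { a.clone }
printf "%.2fs\n", Time.now - t0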

[eric@localhost ruby]$ ruby -r profile -e 'require "modexp.rb"' \
  -e "a=128.random ; 30000.times { a % 5 } "
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 69.98     3.17      3.17        2  1585.00  2265.00  Fixnum#times

shows that well. '%' on a 1024-bit number should take significant
time, but it is clearly being swamped (lost in the noise) by the
overhead of creating and destroying objects. The actual % took only
250ms; the rest is Ruby overhead for creating and destroying the
Bignums.
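
One way to put a rough number on that split (an estimate under my
allocation-overhead assumption, not a direct measurement of the
allocator) is to subtract an empty loop's wall-clock time from the %
loop's:

a = (1 << 1023) | 1                      # 1024-bit stand-in for 128.random
t0 = Time.now ; 30000.times { }        ; empty = Time.now - t0
t0 = Time.now ; 30000.times { a % 5 }  ; moded = Time.now - t0
printf "loop overhead: %.2fs, %% incl. allocation: %.2fs\n",
       empty, moded - empty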

[eric@localhost ruby]$ ruby -r profile -e 'require "modexp.rb"' \
 -e "a=128.random ; 30000.times { a[0] } "
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 61.37     2.51      2.51        2  1255.00  2040.00  Fixnum#times
 37.65     4.05      1.54    30000     0.05     0.05  Bignum#[]

shows that [] itself is very fast, requiring only 40ms or so to
execute 30,000 times, but it is again swamped by the overhead of
creating and destroying objects. Then again, [] returns a Fixnum, not
a Bignum, so this loop never even exercises the Bignum allocator :-(.
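
Which, incidentally, is why Harry's parity routine ought to be nearly
free at the allocation level: the 0 or 1 that [] returns is an
immediate Fixnum, so nothing is heap-allocated. A minimal sketch
(helper name is mine):

def odd_bignum?(n)
  n[0] == 1      # bit test only; the result is an immediate Fixnum
end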

[eric@localhost ruby]$ ruby -r profile -e 'require "modexp.rb"' \
   -e "a=128.random ; 30000.times { 5+1 } "
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 78.83     3.24      3.24        2  1620.00  2050.00  Fixnum#times
 20.19     4.07      0.83    30000     0.03     0.03  Fixnum#+

This proves that the problem is the object creator/destructor, not
the cost of the mathematical operations themselves. So it appears
that for anything requiring many small operations, such as finding
the greatest common divisor with Stein's binary gcd algorithm, I will
need to move the whole thing into "C" rather than implementing it in
Ruby.
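
Until then, one pure-Ruby mitigation suggested by the numbers above
(a sketch under my allocation-overhead assumption, not something I
have measured) is to use [] to count trailing zero bits and then do a
single >>, so stripping the factors of 2 out of a Bignum costs one
new object instead of one per bit:

def strip_twos(n)          # hypothetical helper; assumes n > 0
  k = 0
  k += 1 while n[k] == 0   # bit tests only, nothing allocated
  [n >> k, k]              # one shift, one new Bignum
end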

This is nothing very new. I ran into the same thing when I wrote the
twofish-py module for Python. Even allocating my blocks (string
objects) 8K at a time and encrypting the blocks in "C", Python's
memory allocator limited my performance to a maximum of 2 MB/sec.
When I rewrote the stream encrypter in "C" as well (using the same
low-level "C" routines; the only change was moving the block slinger
into "C"), I managed over 100 MB/sec. I had hoped that Ruby's memory
allocator was significantly faster, but perhaps I am running into
limitations of the GNU 'libc' library, which is not known for its
speed.

It appears that some investigation of Ruby's object creation path is
warranted, to see whether there is a way to speed it up :-(.

Eric Lee Green         Web: http://www.badtux.org 
  GnuPG public key at http://badtux.org/eric/eric.gpg
    Free Dmitry Sklyarov! [ http://www.eff.org ]
