On Sunday 20 November 2005 04:38, Nicolas Cannasse wrote:
> Looking at the Ruby implementation, every object needs to allocate a fairly
> big hashtable (11 buckets by default) in order to get real O(1) access
> in practice - which can be O(n) in the worst case - and that costs (a)
> memory and (b) GC cycles when scanning it.
>
> When Neko, for example, is running inside Apache, that's several hundred
> processes all holding these live objects, so in that case you're more
> often memory-bound than CPU-bound; it then makes sense to trade CPU for
> memory.

I have to agree that there are more considerations than just access speed.
I'm running into this in REXML, where people are complaining that loading any
document of a couple of megs will exhaust their memory, and even 1GB systems
choke on documents greater than 5MB[1].  Part of the problem (perhaps the
dominant problem) is that REXML Element nodes use hashtables for Attribute
lists; so every Element has a hashtable[2].
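
To make the per-Element cost concrete, here is a rough sketch of one way to
avoid giving every node its own hashtable: allocate nothing until the first
attribute is set, and keep attributes as a short list of pairs.  This is not
REXML's actual code; LeanElement and attribute_count are hypothetical names
made up for illustration, and whether it saves memory in practice would have
to be measured.

```ruby
# Hypothetical sketch, not REXML's implementation.
class LeanElement
  attr_reader :name

  def initialize(name)
    @name = name
    @attrs = nil   # no container exists until the first attribute is set
  end

  # Set an attribute, overwriting an existing value for the same key.
  def []=(key, value)
    @attrs ||= []
    pair = @attrs.assoc(key)   # linear scan over [key, value] pairs
    if pair
      pair[1] = value
    else
      @attrs << [key, value]
    end
    value
  end

  # Look up an attribute; returns nil if it is absent.
  def [](key)
    return nil unless @attrs
    pair = @attrs.assoc(key)
    pair && pair[1]
  end

  def attribute_count
    @attrs ? @attrs.size : 0
  end
end
```

Lookup here is a linear scan, so this trades CPU for memory and only makes
sense on the assumption that most elements carry few attributes.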

A program being too slow is certainly a problem, but I'd argue that it isn't
as big of a problem as having the program run your machine out of memory and
then fail terribly.

Anyway, this doesn't meaningfully contribute to the discussion, but the thread
is apropos to a problem I'm struggling with at the moment.



[1] This isn't a Ruby problem, of course.  This is programmer error, in
failing to anticipate the large data sets.  My XML problem domains are always
small, so, of course, are everyone else's :-)

[2] The problem in REXML is actually worse; I use Ruby hashes all over the
place, with little regard, and now I'm having to go back and strip them out
and replace them with balanced trees.  One thing I'm anticipating is having
REXML choose between hash and red-black trees (or whatever I settle on)
intelligently if it can determine the size of the document, and use trees if
it can't.  However, it is more critical to make sure that REXML doesn't choke
on large data sets.
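
The selection idea in [2] could look roughly like the sketch below.  All
names (SortedPairs, attribute_store_for, SMALL_DOCUMENT_BYTES) are
hypothetical, the threshold is arbitrary, and picking a Hash for small
documents is an assumption about what "intelligently" would mean.
SortedPairs is only a stand-in for a balanced tree: it is a sorted array
with binary search, so lookups are O(log n) but inserts are O(n), unlike a
real red-black tree.

```ruby
# Hypothetical sketch; SortedPairs stands in for a balanced tree.
class SortedPairs
  def initialize
    @pairs = []   # [key, value] pairs, kept sorted by key
  end

  def []=(key, value)
    # Index of the first pair whose key is >= key, or the end of the array.
    i = @pairs.bsearch_index { |pair| pair[0] >= key } || @pairs.size
    if i < @pairs.size && @pairs[i][0] == key
      @pairs[i][1] = value
    else
      @pairs.insert(i, [key, value])
    end
    value
  end

  def [](key)
    i = @pairs.bsearch_index { |pair| pair[0] >= key }
    i && @pairs[i][0] == key ? @pairs[i][1] : nil
  end
end

SMALL_DOCUMENT_BYTES = 1_000_000   # arbitrary illustrative threshold

# document_bytes is nil when the size cannot be determined (e.g. a stream).
def attribute_store_for(document_bytes)
  if document_bytes && document_bytes < SMALL_DOCUMENT_BYTES
    {}                # size known and small: a plain Hash
  else
    SortedPairs.new   # size unknown or large: the tree stand-in
  end
end
```

Both containers answer to [] and []=, so the calling code would not need to
know which one it was handed.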

-- 
--- SER
