"Sean O'Dell" <sean / celsoft.com> writes:
> "Yohanes Santoso" <ysantoso / jenny-gnome.dyndns.org> wrote in message
> stack-based memory management, or even reference counting.

Reference counting tends to be unable to perform a complete cleanup
when objects have circular dependencies.
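
To make the cycle problem concrete, here is a toy reference counter
written in Ruby. Ruby itself does not count references; RcNode and its
methods are made up purely for illustration.

```ruby
# Toy reference-counted node. RcNode, point_to, retain and release are
# hypothetical names, not part of Ruby or any library.
class RcNode
  attr_reader :name, :count, :freed

  def initialize(name)
    @name  = name
    @count = 1      # one reference held by whoever created the node
    @refs  = []     # nodes this node points to
    @freed = false
  end

  # Store a reference to another node, bumping its count.
  def point_to(other)
    other.retain
    @refs << other
  end

  def retain
    @count += 1
  end

  # Drop one reference; free the node only when the count reaches zero.
  def release
    @count -= 1
    return unless @count.zero?
    @freed = true
    @refs.each(&:release)
    @refs.clear
  end
end

a = RcNode.new("a")
b = RcNode.new("b")
a.point_to(b)   # b.count is now 2
b.point_to(a)   # a.count is now 2

# The creator drops both of its references ...
a.release       # a.count is now 1
b.release       # b.count is now 1

# ... yet neither count reached zero, so neither node was freed.
puts "a: count=#{a.count} freed=#{a.freed}"
puts "b: count=#{b.count} freed=#{b.freed}"
```

Each node is kept alive only by the other one, which is exactly the
garbage a pure reference counter cannot reclaim.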

> memory available.  When that happens, I want it to shrink back up
> immediately after the process is finished.  But I don't want to limit my
                        ^^^^^^^
The OS will automatically clean up any finished process. Did you mean
function? If so, then you're describing the behaviour of a language
with stack-based MM, like Perl. Even in Perl, there is a need to
regularly restart the process to keep the memory requirement to a
minimum, because Perl cannot free memory allocated in the main
function (you leave the main function iff you stop the process).

> Ruby applications...if they need memory, they need memory.

Your case made me look at Ruby's GC implementation. There does not
seem to be any mechanism for deallocating the allocated heap, so
memory always grows and never shrinks. This is certainly a problem
in your case since you want a long-running process. However, the good
thing is that your OS (most probably) has a memory manager (MM). Unused
heap will be put on the swap disk instead of hogging precious RAM.

from ruby-1.6.6's gc.c:
VALUE rb_newobj() {
    VALUE obj;
    if (!freelist) rb_gc();
    /* some other stuff. */
}

Think of Ruby's MM as memory pooling. Ruby allocates memory in chunks
for efficiency. Every time you need more memory than is available, Ruby
performs GC. Many times there's no need to grab more memory from the
OS because there should be lots of junk that can be discarded. Thus,
memory consumption is kept to a minimum.
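
The pooling idea can be sketched as a toy model. This is not how gc.c
is written; Pool, CHUNK and the "live" list are hypothetical stand-ins
for the real heap, chunk size and mark phase.

```ruby
# Toy grow-only slot pool. All names here are made up for illustration.
class Pool
  CHUNK = 4                      # slots added per growth step
  attr_reader :chunks, :collections

  def initialize
    @freelist    = []
    @chunks      = 0
    @collections = 0
    grow
  end

  # Hand out a slot index. `live` lists the slot indices still in use.
  def allocate(live)
    collect(live) if @freelist.empty?   # try to reclaim junk first
    grow if @freelist.empty?            # grow only if GC found nothing
    @freelist.pop
  end

  private

  # Add one more chunk of slots; the pool never gives a chunk back.
  def grow
    base = @chunks * CHUNK
    @freelist.concat((base...(base + CHUNK)).to_a)
    @chunks += 1
  end

  # Every slot that is not live goes back on the freelist.
  def collect(live)
    @collections += 1
    @freelist = (0...(@chunks * CHUNK)).to_a - live
  end
end

# Mostly garbage: only the newest slot stays live.
pool = Pool.new
live = []
10.times { live = [pool.allocate(live)] }
puts "garbage-heavy: chunks=#{pool.chunks} collections=#{pool.collections}"

# Everything stays live: GC finds nothing, so the pool has to grow.
pool2 = Pool.new
live2 = []
6.times { live2 << pool2.allocate(live2) }
puts "all live:      chunks=#{pool2.chunks} collections=#{pool2.collections}"
```

In the first run the pool stays at one chunk because every collection
finds junk to reuse. In the second it grows to two chunks and stays
there.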

Of course, GC is not perfect. For example, if you load a huge file into
memory, the memory footprint of your process will stay huge even
after you no longer need the file. In this case, GC is helped by
the OS's MM. That unused heap will be stored on your swap disk
instead of hogging precious RAM. How smart the OS is at detecting
unneeded memory depends on which OS you use, and is a great subject
for an OS holy war.

Another instance where GC suffers is when you have lots and lots of
inter-dependent temporary objects (as in the case of JLex).

> I know I can call the garbage collector...but that requires that my code
> recognizes when things are getting heavy.  That means *I* am managing
> memory.  I would rather that each method be able to exit completely cleanly,

As mentioned above, you don't need to manage memory. The decision
whether to call GC or not is made every time you allocate a new object.
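
You can watch this happen without calling the collector yourself. The
sketch below assumes a Ruby recent enough to provide GC.count (it is
not in 1.6.6), and the allocation count is an arbitrary number picked
to be large enough to exhaust the free slots.

```ruby
# Count how many collections are triggered purely by allocation.
# Assumes GC.count is available and that GC has not been disabled.
before = GC.count

# Create a lot of short-lived strings and never call GC.start.
1_000_000.times { "x" * 100 }

after = GC.count
puts "collections triggered by allocation alone: #{after - before}"
```

The exact number depends on the Ruby version and its heap settings.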

> dies, so this won't work.  A true destruction mechanism is called in the
> context of the healthy object, allowing it to perform exit tasks using
> resources it may have allocated or references to other objects/variables.

Actually, that 'true' destruction mechanism you described is used only
in reference-counting MM. In a true GC environment, everything is
taken into account during GC.

> The thing is, I need my programs to shrink up and be as small as possible at
> all times.  I am still trying to grasp how I do that.  With C++, Delphi,

If you mean in terms of virtual memory size, then Ruby cannot do
that (yet?). But in terms of physical memory size, the GC and the OS'
MM make sure that it stays as small as possible.

> Here's an example of what I'm afraid of.  A Ruby process starts running and
> grows extremely large with unused objects, but it doesn't completely hog the
> whole system so garbage collection isn't triggered by a memory failure.  

That can happen, but GC is triggered every time Ruby runs out of heap.

> A few MB's of RAM are left.  

Ruby relies on the OS to perform its MM thingie correctly.

> Then my web server suddenly needs a large amount
> of memory, but there's none available so it fails miserably.  

If the OS does its MM thingie correctly, then the only time the web
server fails miserably is when there really is not enough physical
memory.

> So, my web process dies because Ruby has dead objects floating around in
> memory; that's what that scenario boils down to.

Here is my 'ps' output:
root     22987  0.0  0.1 71296  880 ?        S    Jan30   0:07 apache
ysantoso  8781  0.0  3.7 24184 19076 ?       S    Feb04  21:17 emacs21
ysantoso 10152  0.0  0.4 52728 2276 pts/6    S    Feb15   2:02 mozilla
freenet  31892  0.0  2.1 148136 11048 ?      SN   Jan30   0:01 freenet

apache uses 70MB of virtual memory, but only 0.8MB of physical memory
and it's been running since Jan 30. Apache uses grow-only memory
pooling (if I remember correctly), much like Ruby's, but that does not
affect the OS because it is smart enough not to load the other 69MB
or so into memory.
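
If you want to redo the arithmetic, here is a small sketch that reads
the sizes off the lines above. It assumes the usual 'ps aux' column
order (USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND) with
VSZ and RSS in kilobytes; memory_summary is a made-up helper name.

```ruby
# The four lines from the ps output quoted above.
PS_OUTPUT = <<~PS
  root     22987  0.0  0.1 71296  880 ?        S    Jan30   0:07 apache
  ysantoso  8781  0.0  3.7 24184 19076 ?       S    Feb04  21:17 emacs21
  ysantoso 10152  0.0  0.4 52728 2276 pts/6    S    Feb15   2:02 mozilla
  freenet  31892  0.0  2.1 148136 11048 ?      SN   Jan30   0:01 freenet
PS

# Hypothetical helper: virtual, resident and non-resident size in MB.
# Assumes column 5 is VSZ and column 6 is RSS, both in KB.
def memory_summary(line)
  fields = line.split
  vsz_kb = fields[4].to_i
  rss_kb = fields[5].to_i
  {
    command:         fields.last,
    virtual_mb:      (vsz_kb / 1024.0).round(1),
    resident_mb:     (rss_kb / 1024.0).round(1),
    not_resident_mb: ((vsz_kb - rss_kb) / 1024.0).round(1)
  }
end

summaries = PS_OUTPUT.lines.map { |line| memory_summary(line) }
summaries.each do |s|
  puts format("%-8s virtual=%6.1fMB resident=%5.1fMB not resident=%6.1fMB",
              s[:command], s[:virtual_mb], s[:resident_mb],
              s[:not_resident_mb])
end
```

Dividing by 1024 gives slightly different figures from the rounded
ones I quote in the text, but the picture is the same.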

Emacs's Lisp engine has a GC too. Mozilla uses a surprisingly small
amount of physical memory: 2MB. I think NN4 uses at least 10MB of
physical memory.

What's most relevant to your case is freenet. It runs on a JVM and it
is a long-running process. The JVM's GC is quite similar to Ruby's
GC. So, if you have a long-running Ruby process, then its memory
consumption is probably similar to freenet's: 144MB of virtual memory
and only 10MB of physical RAM. Maybe freenet grew to 144MB because it
deals with file transfer (i.e., perhaps it receives data into memory
before writing it to disk). My total physical memory is only 96MB but
everything works fine. No thrashing, because the OS is doing its MM
thingie properly.

So, actually, it seems the use of grow-only memory pooling is quite
common, but you don't notice it because the OS makes it painless.

YS.