"Yohanes Santoso" <ysantoso / jenny-gnome.dyndns.org> wrote in message
news:8DC67533CBF3F480.E9A6522E91258840.45C12F3FF7DE783D / lp.airnews.net...
> "Sean O'Dell" <sean / celsoft.com> writes:
>
> > Isn't there a way we can *explicitly* destroy objects?  I like to keep
> > things tight and clean in my code,
>
> Why? Depending on what you do and how you do it, GC can be much faster
> than an explicit free(). More on this later.

Because I like controlling when my program cleans itself up.

> > and simply walking away from objects I've
> > created absolutely freaks me out.
>
> It seems that Ruby is your first GC'ed language. Make the
> transition. Let the computer do mundane things like freeing memory,
> and liberate yourself to pursue other, more interesting things (like
> going fishing more often).

It is.  I really don't yet fully understand why this is better than just
stack-based memory management, or even reference counting.
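The closest thing I've found to that deterministic, stack-like cleanup in
Ruby is the block idiom, where a resource's lifetime is tied to a scope.
A minimal sketch (the file name is made up):

    # Deterministic, scope-tied cleanup via a block: the file is closed
    # when the block exits, even if an exception is raised -- no waiting
    # on the garbage collector.
    File.open("example.txt", "w") do |f|
      f.puts "tight and clean"
    end                                 # f.close has already happened here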

> > I don't like leaving my program bloated
>
> Simply restrict your process to a certain maximum memory size. In
> Ruby, limiting the size is an external task: you tell the OS to set
> the limit, e.g., in UNIX you can run the shell command 'limit' (or
> 'ulimit') before starting Ruby. OTOH, the Java interpreter does its
> limiting internally. Either way, you set how 'bloated' your program
> is allowed to get, and forget it.
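If I follow, that would look something like this from inside Ruby itself
(a sketch, assuming a Unix Ruby that supports Process.setrlimit and
RLIMIT_AS; the 256 MB figure is arbitrary):

    # Cap the process's address space; allocations past the cap fail
    # instead of growing the process further.
    Process.setrlimit(Process::RLIMIT_AS, 256 * 1024 * 1024)

    begin
      hog = " " * (512 * 1024 * 1024)   # deliberately over the cap
    rescue NoMemoryError
      puts "allocation refused -- the process stays within its limit"
    end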

Well...I don't know how "bloated" my program is going to be.  Sometimes
things sit quiet and use very little memory.  Sometimes things get hairy
and, for a very short time, something *could*, conceivably, use every bit
of memory available.  When that happens, I want the program to shrink back
down immediately after that work is finished.  But I don't want to cap my
Ruby applications...if they need memory, they need memory.

I know I can call the garbage collector...but that requires that my code
recognize when things are getting heavy.  That means *I* am managing
memory.  I would rather have each method exit completely cleanly, with
every object created during its call gone and out of memory.  I could
call the collector every time, but isn't that a rather large performance
hit, calling it every time one of my methods exits?
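Concretely, the per-method pattern I mean would be something like this
(a sketch; the workload is made up):

    def heavy_work
      scratch = Array.new(100_000) { |i| i.to_s }  # short-lived garbage
      scratch.size
    ensure
      GC.start   # force a full collection on every exit path --
                 # exactly the cost I'm asking about
    end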

> > Also, a related issue...why isn't there a finalize call?  I don't mean the
> > [...]
> > Is that in the works or is that just not the Ruby way?
>
> A finalizer is provided by ObjectSpace.define_finalizer. Look back
> through the archive to learn why the finalizer lives outside the
> object (keyword: define_finalizer).

I believe define_finalizer calls a finalizing method *after* the object
dies, so this won't work.  A true destruction mechanism runs in the
context of the still-healthy object, allowing it to perform exit tasks
using resources it allocated or references it holds to other objects and
variables.
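Here's roughly what the define_finalizer shape looks like, which is
exactly the limitation: the finalizer proc only gets what was captured
up front, never the live object (the class and handle are made up):

    class Resource
      def initialize(handle)
        @handle = handle
        # The proc must not close over `self`, or the object becomes
        # uncollectable; it can only use data captured here, up front.
        ObjectSpace.define_finalizer(self, self.class.make_finalizer(handle))
      end

      def self.make_finalizer(handle)
        proc { puts "cleaning up #{handle} -- the object is already gone" }
      end
    end

    Resource.new("db-connection-42")
    GC.start   # the finalizer runs after collection (or at exit),
               # without access to the dead object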

> Now, on how GC can be faster. Two years ago I had to construct a
> forgiving HTML parser. I ended up making four versions: two Java and
> two C. Of those four, one Java and one C version use object pooling,
> one Java version uses GC, and one C version uses explicit free(),
> performed as soon as a variable is no longer used. All four use Lex
> or JLex, with essentially the same lexer rules. In brief, here is the
> performance ordering for large (> 1 MB) HTML documents, fastest
> first:
>
> C with object pooling < Java with GC < Java with object pooling < C
> with explicit free.
>
> The only time the performance of Java with GC was disappointing was
> when running it on the MS JVM. MS's JVM is so damned brain dead. The
> initial heap cannot be set to more than 32 MB. Setting the maximum
> heap size to 512 MB does not help at all, because JLex produces an
> incredible number of very short-lived objects (Strings). For every
> object created, if the current heap is full, MS's JVM performs a GC
> and grows the heap if necessary. This is as expected from a Java
> GC. However, what I want is to postpone GC as long as possible. Thus,
> I have to set the initial heap size to some large value, which is
> possible under other JVMs but not MS's.
>
> In short, GC is a good idea. Its performance is reasonable, and it
> frees the programmer to do interesting things. It's most certainly
> faster than reference counting (which amounts to a special case of
> explicit free()), and easier at the same time.
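(If I understand the pooling technique referenced above, in Ruby terms it
would be a toy version of something like this; the names are
illustrative:)

    # A toy object pool: reuse instances instead of allocating new ones,
    # trading GC/allocation pressure for manual bookkeeping.
    class Pool
      def initialize(&factory)
        @factory = factory
        @free = []
      end

      def acquire
        @free.pop || @factory.call
      end

      def release(obj)
        @free << obj
      end
    end

    pool = Pool.new { String.new }
    buf = pool.acquire
    buf << "token"
    buf.clear          # reset state before handing it back
    pool.release(buf)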

The thing is, I need my programs to shrink down and stay as small as
possible at all times.  I am still trying to grasp how to do that here.
With C++, Delphi, or Perl (with 'my' variables), I know that when a block
exits, the memory I used (whether stack or explicitly free'd) is gone and
my program has shrunk, for the most part, back to the size it started at
(save for memory allocated in the enclosing context).

Here's an example of what I'm afraid of.  A Ruby process starts running
and grows extremely large with unused objects, but it doesn't completely
hog the whole system, so garbage collection is never triggered by an
allocation failure.  A few MB of RAM are left.  Then my web server
suddenly needs a large amount of memory, but there's none available, so
it fails miserably.  Ruby didn't know the web server needed memory, so it
didn't perform garbage collection.  My web server dies because Ruby had
dead objects floating around in memory; that's what the scenario boils
down to.

In the above scenario, the only way I can think of to prevent that is to
code something into my program that monitors memory usage and calls the
garbage collector whenever I decide it's a good time to clean things up.
But that's not very automatic.
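The kind of watchdog I mean would be something like this (a sketch; the
size check is Unix-specific, and the 100 MB threshold is arbitrary):

    # Poll our own resident size and force a collection past a threshold.
    Thread.new do
      loop do
        sleep 5
        rss_kb = `ps -o rss= -p #{Process.pid}`.to_i   # Unix-only
        GC.start if rss_kb > 100_000                   # ~100 MB
      end
    end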

What am I missing?  When and how often does Ruby trigger garbage
collection?

    Sean