On Mon, Mar 21, 2005 at 08:26:28AM +0900, Lothar Scholz wrote:
> Hello Doug,
> 
> >> No, you have it backwards.  With Ruby, the shared portion between
> >> processes ends up being quite small.  The Ruby interpreter itself is
> >> pretty lightweight, and that is the only code that is shared.  All of
> >> the code that is loaded afterwards is duplicated per process.  So the
> >> RSS of each process ends up being within a few MB of the total RAM
> >> usage, approximately.
> 
> DB> ah, you're right, i sometimes forget that the requires are at runtime.
> 
> DB> from what eric mentioned earlier, it sounds like fcgi is spawning new
> DB> ruby processes instead of forking from a master ruby process. my fcgi
> DB> knowledge is pretty rusty, but if there was a way to do the fork model,
> DB> then we could get fcgi rails installs to use less memory.  that said,
> DB> since it's fcgi, you're already using a lot less memory, since you're
> DB> running a handful of rails fcgi processes fronted by a pool of httpds,
> DB> versus the normal cgi model where each httpd loads and runs their own
> DB> cgis.  so maybe it's a case of diminishing returns...
> 
> How should a forking model help with memory here?
> 
> The first time you get a GC run (and that happens very often) ruby
> walks the whole memory and sets flags in data and code (ruby code
> nodes are also collected by the GC). At this time the MMU of the CPU
> will do a copy on write. So only if you have a lot of large strings or
> read only arrays then you will win something. But i doubt that this is
> the usual case.

hi lothar,

i didn't include enough details before...

if you had a way to register what modules your app depended on, then you
could make sure to load those in the master ruby process that the fcgi
children would be forked from.  copy-on-write would then make the kids
use less memory than if each fcgi was spawned on the fly and loaded the
modules on their own.  that would be happening at a lower level than
ruby's GC, so i figured that would work.
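
a minimal sketch of what i have in mind, assuming the master is allowed to
fork its own children (i don't know that stock fcgi process managers work
this way).  PRELOAD and spawn_preloaded_workers are names i made up, and
the stdlib libraries are only stand-ins for an app's real dependencies:

```ruby
# Hypothetical list of libraries the app depends on (stdlib stand-ins).
PRELOAD = %w[set json time]

# Load the libraries once in the master, then fork `count` children.
# Each child reports back over a pipe whether the libraries were
# already loaded when it started.
def spawn_preloaded_workers(count, libs = PRELOAD)
  libs.each { |lib| require lib }          # loaded once, in the master

  children = Array.new(count) do
    reader, writer = IO.pipe
    pid = fork do
      reader.close
      # `require` returns false when the library is already loaded,
      # so all-false means the child inherited everything from the master.
      inherited = libs.none? { |lib| require lib }
      writer.puts "#{Process.pid} #{inherited}"
      writer.close
      exit! 0                              # skip the master's at_exit hooks
    end
    writer.close
    [pid, reader]
  end

  children.map do |pid, reader|
    child_pid, flag = reader.read.split
    reader.close
    Process.wait(pid)
    { pid: child_pid.to_i, preloaded: flag == "true" }
  end
end
```

a real worker would enter its request loop where the child writes its
report here.  the point is only that the requires run once, before the
fork, so the pages holding the loaded code start out shared.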

but it sounds like you're saying that even if the code did that, the
first GC run in the child would walk the list of data/code nodes and
twiddle them in one way or another, causing copy-on-write to happen for
all nodes, and negating any benefit from being forked from a master
parent process.  did i read you correctly?  if so, that's interesting, i
never expected that...
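
one way to check this rather than guess is to measure it.  the sketch below
is linux-only (it reads /proc/self/smaps) and the helper names are made up.
it builds a heap in the parent, forks, and has the child report its
private dirty memory before and after a forced GC run.  i'm not predicting
the numbers: they depend on where the interpreter keeps its mark bits, and
that differs between ruby versions.

```ruby
# Linux-only: total Private_Dirty in kB for this process, or nil if
# /proc/self/smaps is not available on this platform.
def private_dirty_kb
  path = "/proc/self/smaps"
  return nil unless File.readable?(path)
  File.foreach(path).sum do |line|
    line.start_with?("Private_Dirty:") ? line.split[1].to_i : 0
  end
end

# Allocate objects in the parent, fork, and let the child measure its
# private dirty memory before and after one full GC run.
def dirty_growth_after_gc(object_count = 200_000)
  data = Array.new(object_count) { |i| "item #{i}" }  # heap built pre-fork

  reader, writer = IO.pipe
  pid = fork do
    reader.close
    before = private_dirty_kb
    GC.start                                # force a full mark/sweep
    after = private_dirty_kb
    writer.puts "#{before.inspect} #{after.inspect}"
    writer.close
    exit! 0
  end
  writer.close

  before, after = reader.read.split.map { |v| v == "nil" ? nil : v.to_i }
  reader.close
  Process.wait(pid)
  data.size                                 # keep `data` alive until the child is done
  { before_kb: before, after_kb: after }
end
```

if after_kb jumps by roughly the size of the pre-fork heap, then the GC
walk is un-sharing the pages the way lothar describes.  if it barely
moves, the sharing survives the first collection.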

doug

-- 
"Contrary to what most people say, the most dangerous animal in the
world is not the lion or the tiger or even the elephant.  It's a shark
riding on an elephant's back, just trampling and eating everything they
see." -- Jack Handey