On Wednesday 30 November 2005 9:22 am, TeslaOMD wrote:

I've been writing web apps in Ruby for businesses for years now, based 
on the IOWA framework, and although none of them to date have been designed 
for the kind of scaling you describe, a couple are designed to scale across 
multiple machines.  Now that you know where I am coming from....

> I'll try to oversimplify this in the interest of this post not being
> more of a book than it already is. The software is actually a new idea
> that I'm not going to reveal for creative reasons, but it is probably
> closest to some of the existing social networking/dating apps. The  app
> must support over a million users (yes, this is realistic if not a
> gross underestimate if all goes well). We expect to have several
> thousand concurrent users. AJAX and querying DB data will play a heavy
> role in the system. Users will have lots of preferences/settings to
> manage/configure, most of which will be stored to disk as xml
> (serialized objects perhaps?) rather than the database.
>
> Our reason for selecting Ruby is because we want to continue to bring
> more attention to what we feel is a wonderful language (thank you Rails
> and others for contributing already) and also since we want to reduce
> our time to market as much as possible. We also want to try to give
> back some of our end results to the Ruby community in terms of code.
>
> Our development team is very experienced in web apps, but new to the
> Ruby world as far as serious applications. We are trying to find out
> more about what is already out there, needs to be done, and what
> limitations we might face. We've done extensive research but I would
> like some opinions directly from rubyists, rubycons, whatever. We'd
> like to scale horizontally and vertically and leverage cheap hardware
> to start until the site gets going more.
>
> Current Specs/Design:
>
> Our design uses MVC/n-tier. Business logic should be able to run
> independently on its own servers and not care about the web. Some of
> our basic design/plans are as follows:
>
> 1. Web Server -- Lighttpd - seems to work well with FastCGI, very
> quick. Load balanced to send user to best server. Dedicated servers for
> things like serving images.

I'm using it for an application used by a Fortune 500 company (lighttpd + 
fastcgi), and am very happy with it.  Very fast, and reliable.

> 2. FastCGI or SCGI - We would like to replace FastCGI with something
> else if possible since we have concerns about all of our processes
> being constantly occupied by AJAX polling back to server code. We're
> not entirely convinced FastCGI is a great architecture for us but if we
> do use it, we would like to scale it across many servers and use a
> SessionID to bind a user to a server. I worry about running out of
> available FastCGI processes, even with multiple machines.

As others have mentioned, the multiprocess model does scale pretty well.  My 
solution to the task of binding a user to a specific backend in a way that 
will scale across n machines is to use fcgi only as an intermediary.

The request hits lighttpd.  lighttpd passes it along to the FCGI program.  
That program looks at the session ID, which has encoded into it the specific 
server/process that is handling that user's session.  The FCGI program passes 
the request on to the correct server/process for handling.  If there is no 
session ID, it selects a server/process from the available pool.  This can be 
a simple random selection, or there could be a more sophisticated technique 
employed that would try to do intelligent allocation.
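To make the dispatch idea concrete, here's a rough Ruby sketch.  The backend 
list and the session-ID format are invented for the example; my real code 
differs, but the principle is the same:

```ruby
# Invented backend list and session-ID format, just to show the idea:
# the ID's prefix encodes which backend owns the session.
BACKENDS = ['10.0.0.1:3000', '10.0.0.2:3000', '10.0.0.3:3000']

# New sessions get a backend baked into the ID.
def new_session_id
  idx = rand(BACKENDS.size)            # simple random selection
  "#{idx}-#{rand(2**64).to_s(16)}"
end

# Existing sessions route back to their backend; requests with no
# session ID get a fresh pick from the pool.
def backend_for(session_id)
  if session_id =~ /\A(\d+)-/
    BACKENDS[$1.to_i % BACKENDS.size]
  else
    BACKENDS[rand(BACKENDS.size)]
  end
end

backend_for('2-deadbeef')  # => '10.0.0.3:3000'
```

The random pick could of course be swapped for something smarter (least 
connections, weighted by load) without touching the routing of existing 
sessions.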

The backend processes are themselves multithreaded using Ruby threads (I have 
abused Ruby threads pretty heavily while stress testing, and for this sort of 
thing, I have had absolutely no strange deadlocks or failures), and one can 
run n processes per server.  Between the two layers one has a lot of 
flexibility for tuning.  One can add more frontend machines (lighttpd + fcgi) 
or backend machines, or alter the number of fcgi or backend processes 
independently.  I am pretty confident that it is a cheaply scalable 
architecture that can handle large numbers of requests per second and 
concurrent requests, though this client is still using the app with only very 
modest traffic, so I don't have anything more than speculation to offer, 
here.
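A stripped-down sketch of one such backend process -- thread-per-connection 
over a made-up line protocol, purely for illustration:

```ruby
require 'socket'

# One backend process: every accepted connection gets its own Ruby
# thread, so a slow request never blocks the accept loop.  Run n of
# these per server and let the fcgi layer route requests to them.
def run_backend(port)
  server = TCPServer.new('127.0.0.1', port)
  loop do
    client = server.accept
    Thread.new(client) do |conn|
      request = conn.gets.to_s.chomp   # one request line, in this toy protocol
      # ...dispatch to the real application logic here...
      conn.puts "handled: #{request}"
      conn.close
    end
  end
end
```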

> 3. Database -- MySQL or Postgres - We need transaction support and data
> integrity -- doubts about MySQL in these areas. We also need good join
> performance - database is heavily normalized, may denormalize if
> needed. Separate DBs for things like Logging/Auditing vs. Content. Want
> to cluster DBs. We want connection pooling. Considering making some
> modifications to DBI.  Use stored procedures in the database, no sql in
> the code/dynamic sql. I loathe O/R mappers for complex databases and
> they would likely bring our system to a screeching halt. Unfortunately
> we cannot afford to use Oracle right now.

Depending on the OR mapper, it may work just fine with your db schema, but 
yeah, OR mapping has a runtime performance cost.  It's a matter for debate 
whether the performance cost is worth the development time savings.  In many 
(most, I think) cases, it is, but maybe in your app it is not.

As for MySQL/Pg, there are plenty of success stories with either.  It's mostly 
a preference/religious issue, unless one really needs a feature that isn't 
available in both.

What modifications are you thinking about for DBI?  We have a team in place 
now who will again be maintaining and working on DBI, and though I can't 
speak for the others, I'd love to hear what you are thinking of doing.  Maybe 
start a separate thread to discuss this?

> 7. Sessions - Separate server for managing Sessions if possible. I
> would like to persist things in memory and share if possible. If not
> possible, we'll settle for persisting info to the database. Been
> looking at Session affinity for Ruby some.

This is really completely dependent on how you do your backend.  You can use 
DRb, but that does leave you with a single point of failure, and across your 
entire heavily used application, that single point could become a performance 
bottleneck.

You could keep each user's session associated only with a specific 
server/process.  No DRb bottleneck potential there, but if that process goes 
away, so does the session.

You could meet in the middle on that.  Have all of your backend processes on 
a single machine store sessions in a DRb process, or have mini clusters of 2 
or 3 machines where all the backends on a mini cluster use the same DRb 
process to store sessions.
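A sketch of that middle-ground setup with the standard library's DRb -- the 
class name, URI, and port here are invented for the example:

```ruby
require 'drb'

# One of these runs per mini cluster; every backend process in that
# cluster reads and writes sessions through it.
class SessionStore
  def initialize
    @sessions = {}
    @mutex = Mutex.new
  end

  def [](sid)
    @mutex.synchronize { @sessions[sid] }
  end

  def []=(sid, data)
    @mutex.synchronize { @sessions[sid] = data }
  end
end

# On the session host for the mini cluster:
#   DRb.start_service('druby://0.0.0.0:9000', SessionStore.new)
#   DRb.thread.join
#
# On each backend in that mini cluster:
#   store = DRbObject.new_with_uri('druby://session-host:9000')
#   store['abc123'] = { :user => 'kirk' }
```

Lose the session host and you lose only that mini cluster's sessions, rather 
than all of them.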

> 11. Events/Threading - I am worried about Ruby here, especially after
> reading about Myriad's problems with libevent. We will definitely need
> something similar to delegates and events and  some good queueing and
> threading functionality.

Ruby threading by itself really is solid.  I have a _LOT_ of processes running 
which make use of threading, and since 1.8.x versions of Ruby, I have never 
had any problems with it.  The Myriad/libevent issue with it boiled down to a 
simple implementation conflict, and isn't an indictment of Ruby threading's 
general usefulness.
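A trivial sketch of the kind of thread workout I mean (just an illustration, 
not my actual stress test):

```ruby
# Spin up a pile of Ruby threads doing small units of work against
# shared state; a Mutex guards the shared array.
results = []
mutex = Mutex.new

threads = (1..100).map do |i|
  Thread.new(i) do |n|
    mutex.synchronize { results << n * n }
  end
end
threads.each { |t| t.join }

results.size  # => 100
```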

> Concerns: Our main concern is of course scalability. Our AJAX controls
> (many already finished) will need to poll the server in some cases over
> a specified interval. We know that this is going to create some new
.
.
.
> For instance, a purely hypothetical example might be we have a control
> that lists online users along with status information.  The controls
> would need to requery information every 20 seconds to obtain fresh
> information about the users (are they online? what is their current
> mood? what did they last do?). This means that our server is going to

I'm looking at the logs for one of my apps that makes heavy use of AJAX.  Most 
of them take between 2 and 6 milliseconds to run in the 
backend (IOWA) process.  Your uses for AJAX might be slower than that, or 
they might not be, but the point is that AJAX type requests typically 
transmit and receive limited amounts of information, and trigger only very 
specific processing tasks.  They are pretty fast.  If you have 4000 
concurrent users, and each of their browsers is kicking off an AJAX request 
every 20 seconds, you'll have around 200 requests per second. That does not 
seem like it should be too hard to support with a scalable architecture.  
Just keep your AJAX usage as efficient as possible, and pay attention to how 
that query interval affects performance.  If a 30 second interval saves you X 
requests/second and thus Y machines, is a 20 second interval worth it?
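That back-of-the-envelope math is worth keeping parameterized so you can play 
with the interval (the numbers are just the hypothetical ones from above):

```ruby
# Polling load: each concurrent user's browser fires one AJAX request
# per poll interval, so the steady-state rate is just the ratio.
def ajax_requests_per_sec(concurrent_users, poll_interval_secs)
  concurrent_users / poll_interval_secs.to_f
end

ajax_requests_per_sec(4000, 20)  # => 200.0
ajax_requests_per_sec(4000, 30)  # => ~133.3, the saving to weigh against freshness
```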

> The read-write ratio in the app is probably about 70% read 30% write.
> At any one time however we can expect the app is doing a significant
> amount of writes to the DB but not compared to the amount of reads, so
> keep this in mind. Our entire design needs to factor this into the
> equation.

This may or may not skew your database selection.  MySQL is particularly fast 
at reads, as opposed to writes.


Hope this helps.

Kirk Haines