Hi there,

I plan to do a fairly large project (how large depends on its commercial
success) that will be programmed completely in Ruby. I can't talk about
the specifics right now, but to give you an idea, it will be something
like David Heinemeier Hansson's Basecamp, except that it will not have
anything to do with project management, so there is no direct
competition (it will be a completely German project anyway). If I
remember correctly, I read somewhere in his blog that Basecamp is a
medium-sized project, so when I say -large- project I actually mean a
project that will start out small and, depending on its commercial
success, could get a very large user base. At least that is what I am
hoping for :) It could of course fail completely; only time will tell.

I would like to ask a couple of questions on the scalability of these
kinds of web projects. I was thinking of writing to David directly but I
was hoping that if he (and others with that kind of experience) answers
my questions on the list everybody could profit from that.

Below is my understanding of what I have heard or read about Ruby. If I
am wrong about any of it, don't hesitate to correct me.

In my understanding, the biggest problem in scaling Ruby (CPU-wise) is
that it doesn't have native thread support yet. In terms of a web
application, this is not a problem if you only have, let's say, 30
concurrent users on a fairly new piece of hardware. But what happens if
your site suddenly gets very popular and you jump from 20 to 200 or even
2000 concurrent users? How do you scale such a web app? If you were to
program this web app in Java or any other language that supports native
threads, you could simply throw more CPUs and RAM at it. I am thinking
of a blade server here: as you get more users, you simply stick another
blade in your server and have your peace of mind. As I understand it,
you cannot do something like that with Ruby. Enter Distributed Ruby
(DRb).

As Martin Fowler states in his first law of distributed object design: 
Don't distribute your objects! 
<http://c2.com/cgi/wiki?FirstLawOfDistributedObjectDesign>

Things can really get hairy once they are distributed. All kinds of
things can go wrong. IMO it is kind of like the step from
single-threaded to multithreaded programming, or worse. So it is always
a good thing if you can avoid it. I can't even begin to think about how
I would (unit) test such a beast, but as of now I don't see any real
alternatives for scaling a Ruby app. Of course, when done right it has
many benefits. One is that you can buy the cheapest hardware and plug
the machines together the way Google does, but Google seems to have an
armada of excellent programmers to handle the potentially very
difficult distributed stuff. I am only a single (maybe a little
overambitious) programmer. I have seen a couple of very nice and simple
examples in DRb (from Dave Thomas, for example), but would you really
advise using DRb for a big-time commercial web app?
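
To make clear what I mean by a "simple" DRb example, here is a minimal
sketch of the idea. The Counter class and its increment method are names
I made up for illustration; this is not taken from Dave Thomas's
material, and server and client live in one script only to keep it
short.

```ruby
require 'drb/drb'

# Hypothetical service object (made-up name, for illustration only).
class Counter
  def initialize
    @count = 0
    @lock = Mutex.new
  end

  # Returns the new count; the mutex guards against concurrent callers.
  def increment
    @lock.synchronize { @count += 1 }
  end
end

# Server side: expose one Counter instance. Port 0 asks the OS for a
# free port; DRb.uri then reports the URI that was actually bound.
DRb.start_service('druby://localhost:0', Counter.new)
uri = DRb.uri

# Client side: a proxy that forwards method calls to the served object.
# Normally this part would run in a separate process or on another
# machine; inside one process DRb may dispatch the call locally.
remote = DRbObject.new_with_uri(uri)
first = remote.increment
second = remote.increment
puts "#{first}, #{second}"

DRb.stop_service
```

Examples like this are easy enough. My worry is what happens once
there are many such objects, several machines and real failure cases.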

I would be very curious to know what kind of strategy 37signals follows
with Basecamp. Maybe David can elaborate on that if it is not a big
secret. I am very eager to hear about specific choices from anyone who
has similar experience.

- What kind of hardware are you using?
- Where are the biggest performance bottlenecks in your environment?
- What is your hardware upgrade path in case active user numbers go 
through the roof?
- What kind of httpd do you use?
- What kind of framework are you using?

Thanks
-- 
Sascha Ebach