On Thu, Oct 04, 2007 at 07:10:05AM +0900, Jay Levitt wrote:
> On Thu, 4 Oct 2007 04:40:52 +0900, Chad Perrin wrote:
> 
> >> No argument there, as long as it's understood that there are limits to
> >> what can be achieved.  I don't want to discourage anyone from seeking
> >> linear scalability as an ideal, but it's not a realistic thing to
> >> promise or assume.
> > 
> > It's close enough (again), for many purposes, to "realistic".  When you
> > can get roughly linear scaling up to 100 times as much scaling needs, as
> > opposed to trying to get similar scaling capabilities out of throwing
> > programmers (or programmer time) at the problem, that's certainly
> > "realistic" in my estimation.
> 
> A lot depends on your application requirements.  If you design it from the
> ground up to be "shared nothing", then you may well be lucky enough to
> truly HAVE shared nothing.  But you'll also have a pretty limited feature
> set.

"Feature-rich" is overrated.  Anyone who tries to be everything to
everyone will end up being the wrong thing for pretty much everyone.
You only get into the kind of trouble you describe when you try too hard
to get *everyone* interested.


> 
> What's the big buzzword today?  Social networking.  What did we used to
> call that?  "Community."  What was the single biggest sticky-paper
> community feature?  Buddy lists.  Who does buddy lists besides the Big Guys
> (who can throw money at it) and the really small guys (who fit on a single
> server)?  Nobody.  Why?  Doesn't scale linearly.  Think about what it takes
> to offer a feature that, for every simultaneous user, checks the list of
> every other simultaneous user for people you know.  Shared-nothing *that*.

The answer to that, from where I'm sitting, is to choose between focusing
on "social networking" and focusing on something else.  If you're just
adding it as "yet another feature" to your application to become
buzzword-compliant, you'll become another dot-com startup has-been.  Of
course, there's also always the business strategy of "look successful,
sell to someone big" without actually turning a positive buck along the
way -- and if that's what you want to do, you're on your own.


> 
> My area of expertise was the AOL mail system.  And, looking back, there
> were a number of core features we offered that simply couldn't be done in a
> shared-nothing world over slow phone lines:

<snip a bunch of stuff about features>

> 
> And some of the features were only important in an age where pipes (both
> last-mile and LAN) were very narrow and disks and RAM were very small.
> Spam, in particular, made the "one copy of each message" model obsolete,
> because spammers wouldn't play by the rules.  
> 
> But restricting yourself to only shared-nothing features means ruling out
> an awful lot of features.  Including anything depending on a database
> index, or a table that fits completely in memory, or any sort of
> rate-limiting or duplicate-detection or spam prevention, or in fact
> anything that makes any assumptions at all about the state of any database
> you're interacting with or relational integrity or any other transaction in
> the system, ever.  Including whether the disk drive holding the transaction
> you just wrote to disk has disappeared in a puff of head crash.

You don't always have to write shared-nothing code to get near-linear
scalability -- and it's true that near-linear scalability is something
that only exists within certain ranges before you hit a cost or resource
requirement spike, but if you're smart you plan ahead for those kinds of
things.  Things don't always go as planned, of course, but if you're
smart you plan for *that*, too, by setting aside "money for a rainy day"
and ensuring that, short of your main datacenters and every off-site
backup in the world being eliminated by meteor strikes simultaneously,
any major scaling issues will not require a sudden "right now" fix.


> 
> It was always the little things that bit us.  Know why AOL screen names are
> often "Jim293852"?  Well, it started out as "The name 'Jim' is already
> taken.  Would you like 'Jim2'?".  Guess how well that scales when the first
> available Jim is "Jim35000"?  Not very.
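
For anyone who wants to see the cost spelled out, here is a small Python
sketch of the "would you like 'Jim2'?" approach next to a random-suffix
one.  Both functions, the `taken` set, and the six-digit default are
assumptions made up for illustration.  The quoted text doesn't say what
AOL actually switched to, so the random-suffix version is only a guess at
how a name like "Jim293852" could come about.

```python
import random
from typing import Set, Tuple


def first_free_suffix(base: str, taken: Set[str]) -> Tuple[str, int]:
    """Try base2, base3, ... in order until one is free.

    Returns the suggested name and the number of candidates checked.
    """
    probes = 0
    n = 2
    while True:
        candidate = f"{base}{n}"
        probes += 1
        if candidate not in taken:
            return candidate, probes
        n += 1


def random_suffix(
    base: str, taken: Set[str], rng: random.Random, digits: int = 6
) -> Tuple[str, int]:
    """Try random numeric suffixes until one is free (hypothetical fix).

    Returns the suggested name and the number of candidates checked.
    """
    probes = 0
    while True:
        candidate = f"{base}{rng.randrange(10 ** digits)}"
        probes += 1
        if candidate not in taken:
            return candidate, probes
```

If "Jim" and "Jim2" through "Jim34999" are taken, `first_free_suffix`
checks 34999 candidates before it reaches "Jim35000".  In a real system
each of those checks would be a lookup against the name store rather than
a set membership test.  The random version usually needs only a few tries
as long as the suffix space is much larger than the number of taken
names.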

I'm curious how any of this is meant to support the position that a
faster-executing programming language that imposes greater hurdles on
programmer productivity will be a better investment for scalability than
designing a system that can absorb greater loads by adding hardware
resources.


> 
> Pop-quiz:  Which of *your* core features would you have to eliminate with
> three million simultaneous users?  

Hopefully, by the time you have that many users, you're making enough
money to be able to manage that many users.  If not, your business model
sucks.

-- 
CCD CopyWrite Chad Perrin [ http://ccd.apotheon.org ]
W. Somerset Maugham: "The ability to quote is a serviceable substitute for
wit."