On 10/28/07, M. Edward (Ed) Borasky <znmeb / cesmail.net> wrote:
> Add "large set of very large (binary?) objects". So ... yes, at least
> *one* database/server. This is exactly the sort of thing you *can* throw
> hardware at. I guess I'd pick PostgreSQL over MySQL for something like
> that, but unless you're a billionaire, I'd be doing it from disk and not
> from RAM. RAM-based "databases" look really attractive on paper, but
> they tend to look better than they really are for a lot of reasons:
>
> 1. *Good* RAM -- the kind that doesn't fall over in a ragged heap when
> challenged with "memtest86" -- is not inexpensive. Let's say the objects
> are "very large" -- how about a typical CD length of 700 MB? OK ... too
> big -- how about a three minute video highly compressed. How big are
> those puppies? Let's assume a megabyte. 100K of those is 100 GB. Wanna
> price 100 GB of *good* RAM? Even with compression, it doesn't take much
> stuff to fill up a 160 GB iPod, right?

I might have given you a somewhat inflated impression of how large
our data set is :-)

We have about 100K objects, occupying ~500KB per object. So all in
all, the total size of our data set is no more than 500 MB. We might
grow to maybe twice that in the next 2 years. But that's it.

So it's very feasible to keep the entire data-set in *good* RAM for a
reasonable cost.
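
To put rough numbers on the gap between your hypothetical and our
totals, here is a back-of-the-envelope sketch in Python. It assumes
decimal units (1 GB = 1000 MB), and the helper name total_gb is made
up purely for illustration:

```python
# Back-of-the-envelope sizing; decimal units (1 GB = 1000 MB) throughout.
MB_PER_GB = 1000


def total_gb(object_count, mb_per_object):
    """Total footprint in GB for object_count objects of mb_per_object MB each."""
    return object_count * mb_per_object / MB_PER_GB


# The hypothetical from the quoted text: 100K objects at roughly 1 MB each.
hypothetical_gb = total_gb(100_000, 1)

# Our case, starting from the total stated above (500 MB) and
# assuming it doubles over the next two years.
current_gb = 500 / MB_PER_GB
projected_gb = current_gb * 2

print(f"hypothetical: {hypothetical_gb:.0f} GB")
print(f"ours now: {current_gb:.1f} GB, in two years: {projected_gb:.1f} GB")
```

Even after doubling, our total stays around 1 GB, about two orders of
magnitude below the 100 GB scenario.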

> 2. A good RDBMS design / query planner is amazingly intelligent, and you
> can give it hints. It might take you a couple of weeks to build your
> indexes but your queries will run fast afterwards.

Good point. Unfortunately, MySQL 5 doesn't appear to be able to take
hints. We've analyzed our queries, and there are some strategies we
could definitely improve with manual hints, but alas, we'd need to
switch to an RDBMS that supports them.

> 3. RAID 10 is your friend. Mirroring preserves your data when a disk
> dies, and striping makes it come into RAM quickly.
>
> 4. Enterprise-grade SANs have lots of buffering built in. And for that
> stuff, you don't have to be a billionaire -- just a plain old millionaire.

We had a bad experience with a poorly set up SAN, though we may simply
have been victims of an improper installation.

Thanks,
-Tom