On 2013-01-14, at 6:33 AM, Eliezer Croitoru <eliezer / ngtech.co.il> wrote:

> On 1/11/2013 3:10 PM, Robert Klemme wrote:
>> On Thu, Jan 10, 2013 at 9:36 PM, Eliezer Croitoru <eliezer / ngtech.co.il> wrote:
>>
>>> So I am wondering about other DB solutions that will not use too much
>>> storage space will be fast and can scale if needed.
>>
>> Scale in what direction?  Database size, requests per second, multiple
>> concurrent clients, multiple physical nodes the DB is stored on...?
>>
>> Kind regards
>>
>> robert
>>
> Thanks Robert,
>

Hi,

Here are a bunch of questions to ask yourself. They cover things that
I've found helpful to know.

You might think about why you need to move from your combination of
TokyoCabinet and Redis. In particular, why not just Redis? It's a little
clearer why you'd move from TokyoCabinet: I've used it happily for years
myself, so I can imagine a bunch of reasons.


> The scale is in couple directions:
> - multiple physical nodes the main DB stored on.

Is this for reliability or performance? This sounds like a solution, not
a requirement.

> - Database size

How big do you think it'll be?

> - requests per second

What kind of request rate are you thinking?

Do you care about latency? (You should.) Throughput and latency are
pretty much independent variables when it comes to databases.
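
To make that concrete, here is a small sketch using Little's law
(requests in flight = throughput x latency). The figures are invented
for illustration and are not measurements of any particular database;
the function name is hypothetical.

```python
# Little's law: requests in flight = throughput (req/s) * latency (s).
# All numbers below are made up for illustration.

def in_flight(throughput_per_s, latency_ms):
    """Average number of concurrent requests needed to sustain the load."""
    return throughput_per_s * latency_ms / 1000

# Two hypothetical stores that both sustain 10,000 requests per second.
fast = in_flight(10_000, 1)     # each request takes 1 ms
slow = in_flight(10_000, 100)   # each request takes 100 ms

print(fast)  # 10.0
print(slow)  # 1000.0
```

Both stores have the same throughput, but the slow one needs 100 times
as many requests in flight to reach it, and every caller waits 100
times longer. Knowing one number tells you very little about the other.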

> - master and secondary updates\replication

Again, this sounds like a solution, not a requirement. What's the issue
that makes you say this?

What is your read/write ratio? What is your write rate? Are you updating
existing data or writing new data, and what's the ratio of updates to
new writes? Do you need secondary indexes? How many, and what kind?

If you write to the master and then replicate, there'll be a time period
where the various nodes will provide different results. Can your
application tolerate this? Or do you need some kind of stronger
consistency constraint?
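
A toy model of that window, assuming asynchronous replication with a
fixed delay measured in abstract "ticks". The class names and the
lag_ticks parameter are hypothetical and do not correspond to any real
database's API.

```python
# Toy model of asynchronous master -> replica replication.
# Master, Replica and lag_ticks are hypothetical names for illustration.

class Replica:
    def __init__(self):
        self.data = {}
        self.pending = []  # list of (apply_at_tick, key, value)

    def tick(self, now):
        # Apply every replicated write whose delay has elapsed.
        ready = [p for p in self.pending if p[0] <= now]
        self.pending = [p for p in self.pending if p[0] > now]
        for _, key, value in ready:
            self.data[key] = value

class Master:
    def __init__(self, replica, lag_ticks):
        self.data = {}
        self.replica = replica
        self.lag_ticks = lag_ticks

    def write(self, now, key, value):
        # The master sees the write immediately; the replica sees it later.
        self.data[key] = value
        self.replica.pending.append((now + self.lag_ticks, key, value))

replica = Replica()
master = Master(replica, lag_ticks=2)

master.write(0, "status", "passed")
replica.tick(1)
stale = replica.data.get("status")   # None: the write has not arrived yet
replica.tick(2)
fresh = replica.data.get("status")   # "passed"
```

Between tick 0 and tick 2 a client reading from the replica gets a
different answer than one reading from the master. Whether that is
acceptable is an application question, not a database one.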

Are your updates/writes exposing you to consistency issues? (i.e. do you
need transactions?) If you update (or even write) multiple records, it's
possible that the updates arrive at the replicas in an essentially
random order, and possibly in a different order at different replicas.
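
Here is a minimal sketch of the ordering problem, assuming each replica
simply applies updates in arrival order. The second half shows one
common mitigation, a per-record version number; that scheme is my
assumption for illustration, not something any of the databases named
in this thread is claimed to do in exactly this form.

```python
# Two replicas receive the same two updates in different orders.

def apply_all(updates):
    """Apply (key, value) updates in arrival order; last arrival wins."""
    state = {}
    for key, value in updates:
        state[key] = value
    return state

u1 = ("result", "fail")
u2 = ("result", "pass")

replica_a = apply_all([u1, u2])   # ends with "pass"
replica_b = apply_all([u2, u1])   # ends with "fail": replicas disagree

# Hypothetical mitigation: tag each update with a version and keep
# only the highest version seen for each key.
def apply_versioned(updates):
    state = {}
    for key, version, value in updates:
        if key not in state or version > state[key][0]:
            state[key] = (version, value)
    return state

v1 = ("result", 1, "fail")
v2 = ("result", 2, "pass")

converged_a = apply_versioned([v1, v2])
converged_b = apply_versioned([v2, v1])   # same final state as converged_a
```

Versioning makes a single record converge regardless of arrival order,
but it does nothing for updates that span several records. That is the
case where you have to ask whether you need transactions.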

>
> For now one machine will host the DB while it gets updates from couple
> sources such as human and other auto-testing tools.
> This will be a dedicated DB machine while there are others servers
> which gets updates from the master DB when needed.
> The problem is that the updates are live and should be replicated with
> the smallest delay possible.

What does "when needed" mean given that the updates should be "as soon
as possible"? I suspect the master/slave setup you have in mind is
lifted from how you'd do it with TokyoCabinet or Redis. Things like
Cassandra or Riak or HBase don't do it that way.

Are you ever going to have to scan your whole database? How often?

Cheers,
Bob

>
> Thanks,
> Eliezer