What I would try is using a slave to replicate just the tables you need (ideally just the indexes, if that were possible) and memcached to keep copies of all those objects. I've been using memcached for years and I swear by it. But keeping indexes in memcached is neither easy nor reliable, and MySQL would do a better job of it.

So you would query the slave DB with the conditions you need, but only to return the ids. Then you would ask memcached for those objects. I've been doing something similar in my CMS and it has worked great for me. Here is an article that might explain better where I'm coming from [1]. And if MySQL clusters make you feel a little dizzy, simple slave replication and mysql-proxy [2] might help out too.

Hope it helps,

Adrian Madrid

[1] http://blog.methodmissing.com/2007/4/24/partially-bypass-activerecord-instantiation-when-using-memcached/
[2] http://forge.mysql.com/wiki/MySQL_Proxy

On Oct 27, 4:31 pm, Tom Machinski <tom.machin... / gmail.com> wrote:
> Hi group,
>
> I'm running a very high-load website done in Rails.
>
> The number and duration of queries per-page is killing us. So we're
> thinking of using a caching layer like memcached. Except we'd like
> something more sophisticated than memcached.
>
> Allow me to explain.
>
> memcached is like an object, with a very limited API: basically
> #get_value_by_key and #set_value_by_key.
>
> One thing we need, that isn't supported by memcached, is to be able to
> store a large set of very large objects, and then retrieve only a few
> of them by certain parameters. For example, we may want to store 100K
> Foo instances, and retrieve only the first 20 - sorted by their
> #created_on attribute - whose #bar attribute equals 23.
>
> We could store all those 100K Foo instances normally on the memcached
> server, and let the Rails process retrieve them on each request. Then
> the process could perform the filtering itself.
> Problem is that it's very suboptimal, because we'd have to transfer a
> lot of data to each process on each request, and very little of that
> data is actually needed after the processing. I.e. we would pass 100K
> large objects, while the process only really needs 20 of them.
>
> Ideally, we could call:
>
> memcached_improved.fetch_newest( :attributes => { :bar => 23 }, :limit => 20 )
>
> and have the memcached_improved server filter and return only the
> required 20 objects by itself.
>
> Now the question is:
>
> How expensive would it be to write memcached_improved?
>
> On the surface, this might seem easy to do with something like
> Daemons[1] in Ruby (as most of our programmers are Rubyists). Just
> write a simple class, have it run a TCP server and respond to
> requests. Yet I'm sure it's not that simple, otherwise memcached would
> have been trivial to write. There are probably stability issues for
> multiple concurrent clients, multiple simultaneous read/write requests
> (race conditions etc.) and heavy loads.
>
> So, what do you think:
>
> 1) How would you approach the development of memcached_improved?
>
> 2) Is this task doable in Ruby? Or maybe only a Ruby + X combination
> (X probably being C)?
>
> 3) How much time / effort / people / expertise should such a task
> require? Is it feasible for a smallish team (~4 programmers) to put
> together as a side-project over a couple of weeks?
>
> Thanks,
> -Tom
> --
> [1] http://daemons.rubyforge.org/
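
To make the ids-then-memcached lookup from the reply at the top concrete, here is a minimal Ruby sketch. Everything in it is an assumption for illustration: FakeSlaveIndex stands in for the replicated slave DB, FakeCache stands in for a memcached client (its get_multi is only modelled on the multi-key fetch memcached supports), and the "foo:<id>" key scheme and the fetch_newest helper are hypothetical names, not part of any library.

```ruby
# Sketch only: in-memory stand-ins for the slave DB and for memcached.
Foo = Struct.new(:id, :bar, :created_on, :payload)

# Stand-in for a memcached client: a key/value store with a multi-get.
class FakeCache
  def initialize
    @store = {}
  end

  def set(key, value)
    @store[key] = value
  end

  # Returns a hash containing only the keys that were found.
  def get_multi(keys)
    keys.each_with_object({}) do |key, found|
      found[key] = @store[key] if @store.key?(key)
    end
  end
end

# Stand-in for the slave: it holds only the columns needed for filtering.
class FakeSlaveIndex
  # rows is an array of [id, bar, created_on] triples (created_on as an Integer).
  def initialize(rows)
    @rows = rows
  end

  # Roughly: SELECT id FROM foos WHERE bar = ? ORDER BY created_on DESC LIMIT ?
  def newest_ids(bar:, limit:)
    @rows.select { |_id, row_bar, _created| row_bar == bar }
         .sort_by { |_id, _bar, created| -created }
         .first(limit)
         .map(&:first)
  end
end

# Ask the index for ids only, then fetch the full objects from the cache.
# Cache misses are simply dropped here; a real version would presumably
# fall back to the database for them and repopulate the cache.
def fetch_newest(index, cache, bar:, limit:)
  ids = index.newest_ids(bar: bar, limit: limit)
  found = cache.get_multi(ids.map { |id| "foo:#{id}" })
  ids.map { |id| found["foo:#{id}"] }.compact
end
```

The point of the split is that only a handful of ids and the matching objects cross the wire per request, instead of all 100K objects; the filtering and sorting stay in the database, where the indexes live.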