If your application can be written in Java, check Hibernate (
www.hibernate.org ). The learning curve is not insignificant, but IMO, well
worth it.

Matt


-----Original Message-----
From: Tim Bates [mailto:tim / bates.id.au]
Sent: Wednesday, January 07, 2004 5:07 AM
To: ruby-talk / ruby-lang.org
Subject: Database applications and OOness


People,
I'm going to write an application that needs to deal with reasonably
large amounts of data over a reasonably long period of time. So the
question that immediately comes up is, how do I store said data when the
program is not running? As I see it, the options can be divided into two
categories in two different ways: How the data is stored and how it is
interfaced.

The data can be stored in (essentially) one of two ways: in memory, or
on disk. In my case, in-memory systems are probably no good - with the
volumes of data I want to hold, and the periods of time I want to hold
them over (some if it is accounting data, which is likely to be kept for
years) I don't want it all in memory. Therefore, I will choose a
disk-based system such as an RDBMS. My RDBMS of choice is PostgreSQL, so
I'll run with that for now.

The other way my options can be divided is (and I'm simplifying things a
bit again) between those libraries that present the data as objects and
those that present it as database rows. Let me clarify:

#<Person:0x40132728 @name="name", @number="phone number",
@address="address">
  vs
["name","address","phone number"]

I like the first one better, because you get encapsulation and behaviour
with your data, but most of the database interface libraries that I've
seen of that type tend to take most of the querying out of the hands of
the user. They allow you to retrieve objects that have properties that
match a given example, but very seldom any more complex than that - the
equivalent of limiting yourself to "SELECT * FROM table WHERE
table.column = ?" type queries only. Part of the reason I like RDBMSs is
that I can do complex queries which don't necessarily return a simple
list of objects, like:

"SELECT s.name AS name, COUNT(DISTINCT t.customer_id) AS customers
 FROM salesperson s JOIN transactions t ON s.id = t.salesperson
 ORDER BY customers DESC;"

This really returns a salesperson-to-integer mapping of how many
customers a salesperson has dealt with. You can't represent this as a
plain list of objects, unless you created a special type of object just
for it (which is silly) and most OO-type database libraries won't let
you make that sort of query at all without going behind their backs,
which is really ugly.

The alternative is to use something like vanilla DBI, in which you write
your own SQL and everything comes back as arrays of values. This allows
you to do whatever queries you want but you lose the niceness of the OO
approach. I don't want to do vanilla DBI, it's too messy.

What I want is some sort of happy medium, and I'm starting to think I'll
have to write it as none of the database interface libraries I've
evaluated do it the way I want - which is the motivation for writing a
lot of software, is it not? I believe that's how Ruby itself got its
start, so I'm in noble company...

But also on a deadline. Any suggestions?

Tim Bates
-- 
tim / bates.id.au