--11Y7aswkeuHtSBEs
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Fri, Aug 04, 2006 at 04:59:36PM +0900, Helge Elvik wrote:
> Hi,
> 
> I'm currently implementing a database application in Ruby, and I use
> PostgreSQL for the data store. At first I tried using DBI to make the
> data store as platform/database independent as possible, but when I came
> back to work this morning I found that my computer was starved for
> virtual memory, and that Ruby was using about 700MBs of RAM and rising.
> 
> What I found out is that DBI slurp reads the whole result set into
> memory before it sends it back from the library. I poked around in the
> code, but found no way to instead return one-and-one result serially as
> the server sends them back.
> 
> So I poked around in the documentation for the postgres-library, but
> sadly with no luck there either. No method there accepts a block so that
> you can just receive the results serially. I also had a quick poke at
> postgres-pr, but the situation seems to be the same there.
> 
> Am I missing something, or is the really now way of receiving the
> results serially without the libraries slurping everything up in RAM
> first? The table I'm having trouble with currently has about 2 million
> rows, so reading everything into RAM is absolutely impossible. I'm
> looking for something like this:
> 
> db.execute("select * from huge_table").each_row do |row|
> 	# process row here
> end
> 
> PS: Reading these 2 million rows is not a normal day-to-day operation,
> but I do this when I normalize some data from a CSV file. Basically I
> restructure the data into several tables, and for now I've mostly done
> this with "CREATE TABLE t AS SELECT". I've found that I need to compute
> some of the columns with a Ruby-function though, so I basically need to
> loop through them all.
> 
> Regards,
> Helge Elvik

There might be another solution: go back to the DB server, and do your
computation DB-side with an embedded procedure. This would prevent
slurping any data at all to the client and back, and if your
computation is relatively "light" might even reduce load on your
server since it doesn't have to stream 2M rows back and forth.

Postgres has excellent support for embedded procedures in multiple
languages, even in Ruby I believe, although I've never done that
myself yet.

YMMV depending on your problem.

-Jgen

-- 
 The box said it requires Windows 95 or better so I installed Linux

--11Y7aswkeuHtSBEs
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: Digital signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (GNU/Linux)

iQEVAwUBRNyOjvy64gyiEfXtAQIazQf/cHb4yCjFdtdIFgxj3AYKyaVzK4Xf1XO3
2wzUswj9tqWFYlvfJ9wdLtqWDGug45Kr8MY5IunMwvNQo9inzarE95Yvbl/MhX7z
LawPP7RyRPoG0EsVl844TeZMh4UfPqqqIJ7+EdkE9xahgicQhM5t/vTPsdG0cXmb
EL51wWfSZKBj2wOFTcjVyAXsPHI4dN6+RrunJ8kv1M8y6NB9Otit5IJ3xKIqRkWy
GfJ0lIByORkABiX8eaM+uR+Rt8/MQ8nq2YOcSArqIvNxRU9Ha+vjv+Usqp8tK9p5
IEF2rOYxXbkn4LVyplCkF9NXa5ysfhWLE2ys60rrsMu9JkyK52RRsQ1fm
-----END PGP SIGNATURE-----

--11Y7aswkeuHtSBEs--