On Wed, Feb 28, 2007 at 07:08:32AM +0900, Ben Johnson wrote:
> I am making a web service call and getting back very large responses
> (sometimes 5gb). When I get this response it eats all of my RAM. I need
> to read the response in chunks so I can store it in a file and I have no
> idea how to do this. Any help is greatly appreciated.
> 
> Here is my code:
> 
> 
> require 'soap/wsdlDriver'
> 
> soap = SOAP::WSDLDriverFactory.new("some url").create_rpc_driver
> soap.wiredump_file_base = "soapfile"
> 
> response = soap.GetWhatever(:whatever => "whatever")
> 
> 
> Ironically, when reading the response it doesn't dump it into the file
> until it gets the entire response into memory, this is what's killing my
> server. Is there a more efficient way of doing this?

You just want to get the whole response into a file? Then I'd suggest:

1. build the SOAP XML request as a string

2. connect to the server using HTTP

3. post the XML you built in step 1

4. read the response as a stream and write it to a file.

To get the response as a stream, you can probably still use Net::HTTP. If
the response from the server is chunked (use tcpdump to check this), you can
call HTTPResponse#read_body with a block, and the chunks will be passed to
you in turn. The following example is given in the documentation:

     # using block
     http.request_post('/cgi-bin/nice.rb', 'datadatadata...') {|response|
       p response.code               # HTTPResponse has #code, not #status
       p response['content-type']
       response.read_body do |str|   # read body now
         print str
       end
     }
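As an alternative to tcpdump, the response headers can also be inspected from Ruby. This is only a small sketch: the helper name body_framing is made up for illustration, while chunked? and content_length are the header accessors that Net::HTTP responses provide.

```ruby
require 'net/http'

# Report how the server framed the body, judging only by the headers.
# (body_framing is a hypothetical helper, not part of Net::HTTP.)
def body_framing(response)
  if response.chunked?
    'chunked'                                   # Transfer-Encoding: chunked
  elsif response.content_length
    "content-length: #{response.content_length}"
  else
    'until connection close'                    # neither header present
  end
end
```

Inside the request_post block it could be called as `puts body_framing(response)` before read_body.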

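If the response is not chunked, read_body with a block should still work:
Net::HTTP reads a Content-Length body from the socket in pieces and yields
each piece to the block, so there should be no need to pull out the @socket
from the object and read(65536) it in a loop yourself.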

If you want to *parse* the response on the fly, then you could use REXML in
stream parsing mode: see
http://www.germane-software.com/software/XML/rexml/docs/tutorial.html and
scroll down to "Stream Parsing".

You may then need an IO.pipe or similar object which accepts the HTTP chunks
on one side and gives a readable stream on the other.
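The pipe idea might look something like the sketch below. CountingListener and parse_chunks are names invented for this example; REXML::StreamListener and REXML::Document.parse_stream are the REXML stream-parsing entry points. A thread writes the chunks into one end of the pipe while the parser reads from the other.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical listener: tallies elements and text bytes instead of
# building a document tree.
class CountingListener
  include REXML::StreamListener
  attr_reader :elements, :text_bytes

  def initialize
    @elements = Hash.new(0)
    @text_bytes = 0
  end

  def tag_start(name, _attrs)
    @elements[name] += 1
  end

  def text(data)
    @text_bytes += data.bytesize
  end
end

# chunks is anything whose #each yields strings (hypothetical interface).
def parse_chunks(chunks, listener)
  reader, writer = IO.pipe
  feeder = Thread.new do
    begin
      chunks.each { |chunk| writer.write(chunk) }
    ensure
      writer.close            # end-of-file tells the parser the input is done
    end
  end
  REXML::Document.parse_stream(reader, listener)
  feeder.join
  listener
ensure
  reader.close if reader && !reader.closed?
end
```

With Net::HTTP, the chunks argument could presumably be `response.to_enum(:read_body)`, passed from inside the request_post block; I have not tried that against a real 5GB response.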

But this may still be a problem if your 5GB response consists mainly of a
single element, <some-tag>...5GB of data...</some-tag>. I'm not sure if
REXML will call text() with blocks, or will try to slurp the whole 5GB in
before calling text() once.

HTH,

Brian.