On 10/31/06, Bjorn Borud <borud-news / borud.no> wrote:
>
> I have the following code:
>
> def fetch_into (uri, name)
>   http = Net::HTTP.new(uri.host, uri.port)
>   req = Net::HTTP::Get.new(uri.path)
>   req.basic_auth(USERNAME, PASSWORD)
>   start_time = Time.now.to_f
>   File.open(name, "w") do |f|
>     print " - fetching #{name}"
>     http.request(req) do |result|
>       f.write(result.body)
>       f.close()
>       elapsed = Time.new.to_f - start_time
>       bps = (result.body.length / elapsed) / 1024
>       printf ", at %7.2f kbps\n", bps
>     end
>   end
> end
>
> this is run in a very simple loop that doesn't do anything that
> requires much CPU.  the files downloaded are about 10Mb and since the
> connection is not that fast (about 15Mbit/sec) I would expect this to
> consume little CPU, but in fact it *gobbles* up CPU.  on a 2Ghz AMD it
> eats 65% CPU on average (the job runs for hours on end).
>
> where are the cycles going?  I assumed it would be a somewhat
> suboptimal way of doing it since there might be some buffer resizing
> in there, but not *that* badly.
>
> anyone care to shed some light on this?
>
> (I would assume that there is a way of performing an http request in a
> way where you can read chunks of the response body at a time?)

Hi,
there seems to be Net::HTTPResponse#read_body, which can hand you the
chunks as they arrive (not tested, copied and pasted from the docs):

  # using iterator
  http.request_get('/index.html') {|res|
    res.read_body do |segment|
      print segment
    end
  }

BTW, you could move the File.open inside the request block, which saves
the explicit f.close() call. You could also try fiddling with the GC:
GC.disable while receiving might help, or it might not. Don't forget to
enable it again between requests.
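
If you do experiment with the GC, here is a minimal sketch of one way to
make sure it always gets switched back on, even when the download raises.
The helper name without_gc is made up for illustration; it is not part of
Ruby or Net::HTTP, and whether it helps at all is something to measure.

```ruby
# Hypothetical helper: run the block with the GC disabled, then restore
# the previous state even if the block raises.
def without_gc
  # GC.disable returns true if the GC was already disabled before the call.
  was_disabled = GC.disable
  yield
ensure
  # Only re-enable if this helper was the one that turned the GC off.
  GC.enable unless was_disabled
end

# Example use with a stand-in for the real download work.
result = without_gc { "x" * 1024 }
puts result.bytesize
```

With something like that, a call would look like
without_gc { fetch_into(uri, name) }.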

So, with the chunked read, something like:

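def fetch_into(uri, name)
  http = Net::HTTP.new(uri.host, uri.port)
  req = Net::HTTP::Get.new(uri.path)
  req.basic_auth(USERNAME, PASSWORD)
  start_time = Time.now.to_f
  bytes = 0
  print " - fetching #{name}"
  # GC.disable # optional
  http.request(req) do |result|
    # "wb" so the body is written byte for byte
    File.open(name, "wb") do |f|
      result.read_body do |segment|
        f.write(segment)
        # count as we go: after a block-style read_body the body is
        # no longer available as a string, so result.body.length won't work
        bytes += segment.bytesize
      end
    end
    elapsed = Time.now.to_f - start_time
    # bytes / seconds / 1024 is kilobytes per second, not kilobits
    rate = (bytes / elapsed) / 1024
    printf ", at %7.2f KB/s\n", rate
  end
  # GC.enable
end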
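
To try the chunked approach without touching the real server, here is a
rough self-contained sketch. Everything in it apart from the Net::HTTP
calls is an assumption for illustration: the stream_to_file name, the
throwaway local server on 127.0.0.1, and the 200000-byte payload. The
third argument to Net::HTTP.new is nil only to skip any proxy configured
in the environment.

```ruby
require "net/http"
require "socket"
require "tempfile"

# Stream the response body to path chunk by chunk; returns bytes written.
def stream_to_file(http, req, path)
  bytes = 0
  http.request(req) do |res|
    File.open(path, "wb") do |f|
      res.read_body do |segment|
        f.write(segment)
        bytes += segment.bytesize
      end
    end
  end
  bytes
end

# Throwaway local server (illustration only): answers one request
# with a fixed payload and then closes the connection.
payload = "x" * 200_000
server = TCPServer.new("127.0.0.1", 0)
port = server.addr[1]
server_thread = Thread.new do
  client = server.accept
  # Read and discard the request line and headers.
  loop do
    line = client.gets
    break if line.nil? || line == "\r\n"
  end
  client.write("HTTP/1.1 200 OK\r\n" \
               "Content-Length: #{payload.bytesize}\r\n" \
               "Connection: close\r\n\r\n")
  client.write(payload)
  client.close
end

out = Tempfile.new("fetch")
http = Net::HTTP.new("127.0.0.1", port, nil) # nil: do not use a proxy
req = Net::HTTP::Get.new("/")

started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
bytes = stream_to_file(http, req, out.path)
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
server_thread.join
server.close

printf "%d bytes at %.2f KB/s\n", bytes, bytes / elapsed / 1024
```

Since only one segment is held at a time, memory use should stay flat
regardless of file size; how much CPU it saves is something you would
have to measure on your side.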