I'm reposting this because I've had little response to this version
of the patch, which still works against the latest snapshot:
brains hgs 60 %> md5sum ruby-1.9-today.tar.gz
ea0cf2a5316288135ff5112c035cab4f ruby-1.9-today.tar.gz
brains hgs 61 %> ls -ld !$
ls -ld ruby-1.9-today.tar.gz
-rw-r--r-- 1 hgs staff 5482936 Oct 14 20:02 ruby-1.9-today.tar.gz
brains hgs 62 %>
This patch modifies lib/net/http so that get() requests compressed
HTTP responses by default, with the possibility of turning this off
(by supplying an "Accept-Encoding: identity" header). The patch also
provides the code to detect whether "Content-Encoding:" has been set
and to decompress the body accordingly. It is known to work for gzip
(tried with msn.com), but I can't find a server delivering Deflate-d
content to confirm that Deflate works as well. The intent of this
patch is to reduce the bandwidth demands of ruby applications making
use of HTTP with the least effort from future users of ruby. Help
with testing, refinements, etc. would be welcome.
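
Until a server sending Deflate-d content turns up, the decompression
step can at least be exercised locally. What follows is only a minimal
sketch and not part of the patch: decode_body is a hypothetical helper
that mirrors the case statement in the patched get(), and the payload
is made up. It covers only zlib-wrapped deflate data (which is what
Zlib::Deflate.deflate produces), so a server that sends a raw deflate
stream would not be covered by it.

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper (not in the patch): decode a body according to
# the value of its Content-Encoding header, as the patched get() does.
def decode_body(encoding, body)
  case encoding
  when "gzip"
    Zlib::GzipReader.new(StringIO.new(body)).read
  when "deflate"
    Zlib::Inflate.inflate(body)
  else
    body # "identity" or anything unknown: leave the body alone
  end
end

# Made-up, ASCII-only payload; repetition makes it compress well.
original = "hello, compressed world " * 20

# Build a gzip-encoded body in memory.
gz_io = StringIO.new(String.new)
gz = Zlib::GzipWriter.new(gz_io)
gz.write(original)
gz.close
gzipped = gz_io.string

# Build a zlib-wrapped deflate body in memory.
deflated = Zlib::Deflate.deflate(original)

puts decode_body("gzip", gzipped) == original
puts decode_body("deflate", deflated) == original
```

This says nothing about what real servers send under the name
"deflate"; it only checks that the two Zlib calls used in the patch
undo the matching compression.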
Hugh
--- ./lib/net/http.rb.orig 2007-10-07 09:53:06.000000000 +0100
+++ ./lib/net/http.rb 2007-10-09 12:35:00.104128000 +0100
@@ -27,6 +27,8 @@
require 'net/protocol'
require 'uri'
+require 'zlib'
+require 'stringio'
module Net #:nodoc:
@@ -477,6 +479,7 @@
@use_ssl = false
@ssl_context = nil
@enable_post_connection_check = true
+ @compression = nil
end
def inspect
@@ -740,7 +743,18 @@
public
# Gets data from +path+ on the connected-to host.
- # +header+ must be a Hash like { 'Accept' => '*/*', ... }.
+ # +initheader+ must be a Hash like { 'Accept' => '*/*', ... },
+ # and it defaults to an empty hash.
+ # If +initheader+ doesn't have the key 'accept-encoding', then
+ # a value of "gzip;q=1.0,deflate;q=0.6,identity;q=0.3" is used,
+ # so that gzip compression is used in preference to deflate
+ # compression, which is used in preference to no compression.
+ # Ruby doesn't have libraries to support the compress (Lempel-Ziv)
+ # compression, so that is not supported. The intent of this is
+ # to reduce bandwidth by default. If this routine sets up
+ # compression, then it does the decompression also, removing
+ # the header as well to prevent confusion. Otherwise
+ # it leaves the body as it found it.
#
# In version 1.1 (ruby 1.6), this method returns a pair of objects,
# a Net::HTTPResponse object and the entity body string.
@@ -774,10 +788,31 @@
# end
# }
#
- def get(path, initheader = nil, dest = nil, &block) # :yield: +body_segment+
+ def get(path, initheader = {}, dest = nil, &block) # :yield: +body_segment+
res = nil
+ unless initheader.keys.any?{|k| k.downcase == "accept-encoding"}
+ initheader["accept-encoding"] = "gzip;q=1.0,deflate;q=0.6,identity;q=0.3"
+ @compression = true
+ end
request(Get.new(path, initheader)) {|r|
- r.read_body dest, &block
+ if r.key?("content-encoding") and @compression
+ @compression = nil # Clear it till next set.
+ the_body = r.read_body dest, &block
+ case r["content-encoding"]
+ when "gzip"
+ r.body= Zlib::GzipReader.new(StringIO.new(the_body)).read
+ r.delete("content-encoding")
+ when "deflate"
+ r.body= Zlib::Inflate.inflate(the_body)
+ r.delete("content-encoding")
+ when "identity"
+ ; # nothing needed
+ else
+ ; # Don't do anything dramatic, unless we need to later
+ end
+ else
+ r.read_body dest, &block
+ end
res = r
}
unless @newimpl
@@ -2260,6 +2295,12 @@
read_body()
end
+ # Because it may be necessary to modify the body (e.g. after
+ # decompression), this method facilitates that.
+ def body=(value)
+ @body = value
+ end
+
alias entity body #:nodoc: obsolete
private
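
The header check in get() above can also be tried in isolation. The
following is a sketch under assumptions, not code from the patch:
with_default_accept_encoding and DEFAULT_ACCEPT_ENCODING are
hypothetical names. Unlike the patch, it works on a copy, so the
caller's hash is not modified, and it tolerates nil, which a caller
might pass explicitly.

```ruby
# Same value the patch uses when the caller gives no Accept-Encoding.
DEFAULT_ACCEPT_ENCODING = "gzip;q=1.0,deflate;q=0.6,identity;q=0.3"

# Hypothetical helper (not in the patch). Returns [headers, added]:
# a copy of the headers, plus a flag saying whether the default
# Accept-Encoding value was added to that copy.
def with_default_accept_encoding(initheader)
  headers = (initheader || {}).dup
  if headers.keys.any? { |k| k.to_s.downcase == "accept-encoding" }
    [headers, false] # the caller chose an encoding; respect it
  else
    headers["accept-encoding"] = DEFAULT_ACCEPT_ENCODING
    [headers, true]
  end
end

# Default case: compression is requested on the caller's behalf.
caller_headers = { "Accept" => "*/*" }
headers, added = with_default_accept_encoding(caller_headers)

# Opt-out case: the caller asks for no compression.
opt_out, opt_added = with_default_accept_encoding("Accept-Encoding" => "identity")

puts added
puts opt_added
```

The returned flag plays the role that @compression plays in the patch,
but as a per-call value rather than state kept on the connection
object.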