Issue #6492 has been updated by drbrain (Eric Hodel).

File net.http.inflate_by_default.2.patch added

I've updated this patch.  Upon working with the code again and looking at RFC 2616, I have made the following changes:

> naruse (Yui NARUSE) wrote:
> > If Inflater's @socket.read returns nil or a string shorter than clen, it means the input is finished and @inflate can finish.
> > So at that time, you can call @inflate.finish.
>
> I hadn't thought of that, I will implement it.

Due to read_chunked and persistent connections, I don't see how to make this work.

This strategy would work when the body length comes from Content-Length or Content-Range, but read_chunked reads multiple chunks of the compressed body, and the end of the input to inflate is indicated by a terminating "0\r\n\r\n" on the raw socket.  Adding this communication between the raw socket and Inflater seems worse.

When the connection is persistent, #read should only return nil when the connection was abnormally terminated, in which case we will throw away the body.

For #read_all, this would work.
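
To illustrate the read_chunked case, here is a minimal sketch (not the code from the patch).  The name read_chunked_inflated and the StringIO standing in for the raw socket are assumptions made for the example.  The end of the compressed input is only visible as the zero-size chunk on the wire, and the next response on a persistent connection must be left unread:

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper: reads a chunked, compressed body from an IO-like
# object and returns the inflated body.
def read_chunked_inflated(socket)
  inflate = Zlib::Inflate.new(32 + Zlib::MAX_WBITS)
  body = String.new
  loop do
    len = socket.readline("\r\n").to_i(16) # hexadecimal chunk-size line
    break if len.zero?                     # "0\r\n" marks the last chunk
    body << inflate.inflate(socket.read(len))
    socket.read(2)                         # CRLF that follows the chunk data
  end
  socket.readline("\r\n")                  # blank line ending the empty trailer
  body << inflate.finish
ensure
  inflate.close if inflate
end

original   = "chunked body " * 20
compressed = Zlib.gzip(original)

# Build a chunked message by hand, 10 bytes of compressed data per chunk.
wire = String.new
compressed.bytes.each_slice(10) do |slice|
  part = slice.pack('C*')
  wire << part.bytesize.to_s(16) << "\r\n" << part << "\r\n"
end
wire << "0\r\n\r\n"

# Anything after the terminator belongs to the next response.
socket   = StringIO.new(wire + "HTTP/1.1 200 OK\r\n")
inflated = read_chunked_inflated(socket)
leftover = socket.read
```

Here the inflate stream never sees a short read or nil; the loop stops because of the chunk framing, which is why the "finish on short read" rule does not map onto this path.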

Due to all the special cases, I changed Net::HTTPResponse#inflater to yield the Inflater and automatically clean it up.  This keeps the special information about cleanup out of #read_body_0.
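
As a rough illustration of the yield-and-clean-up shape (not the code from the patch): with_inflater below is a hypothetical stand-in for Net::HTTPResponse#inflater, and it yields a bare Zlib::Inflate rather than the Inflater wrapper.

```ruby
require 'zlib'

# Hypothetical stand-in: yields an inflate stream for compressed
# encodings (nil otherwise) and always closes it afterwards.
def with_inflater(content_encoding)
  return yield(nil) unless %w[deflate gzip x-gzip].include?(content_encoding)

  inflate = Zlib::Inflate.new(32 + Zlib::MAX_WBITS)
  begin
    yield inflate
  ensure
    inflate.close # runs even if the block raises
  end
end

compressed = Zlib.gzip("response body")

seen = nil
body = with_inflater('gzip') do |inflate|
  seen = inflate
  inflate.inflate(compressed)
end

plain = with_inflater('identity') { |inflate| inflate }
```

The caller only supplies the block; it never needs to know when or how the stream is closed.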

> this variable inflater is confusing with the inflater method.

In Net::HTTPResponse#read_chunked, the confusing "inflater" variable has been replaced with "chunk_data_io" which comes from RFC 2616 section 3.6.1.

> This read method returns a string whose length is not clen, this is wrong.
> Other IO-like objects, for example Zlib::GzipReader, return a string whose length is clen.
> So Inflater should have an internal buffer and return a string whose length is just clen.

Upon review, I think this is OK.

RFC 2616 specifies that Content-Length and Content-Range (which are used for clen) refer to the transferred bytes and are used to read the correct amount of data from the response to maintain the persistent connection.  Net::HTTPResponse#read_body doesn't allow the user to specify the number of bytes they wish to read, so returning more data to the user is OK.

I have made an additional change beyond your review:

I've added a Net::ReadAdapter to the Inflater to stream the encoded response body through inflate without buffering it all.  This will reduce memory consumption for large responses.
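
A minimal sketch of the streaming idea, assuming a plain block in place of Net::ReadAdapter; stream_inflate and the 16-byte read size are made up for the example and are not part of the patch:

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper: inflates an IO-like object piece by piece and
# yields each inflated fragment as soon as it is available.
def stream_inflate(io, chunk_size = 16)
  inflate = Zlib::Inflate.new(32 + Zlib::MAX_WBITS)
  begin
    while (chunk = io.read(chunk_size)) # nil at end of input
      out = inflate.inflate(chunk)
      yield out unless out.empty?
    end
    rest = inflate.finish
    yield rest unless rest.empty?
  ensure
    inflate.close
  end
end

original = "streamed response body\n" * 100
source   = StringIO.new(Zlib.gzip(original))

pieces = []
stream_inflate(source) { |part| pieces << part }
```

Only one small compressed chunk is held at a time; the example collects the output in an array purely so the result can be checked.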
----------------------------------------
Feature #6492: Inflate all HTTP Content-Encoding: deflate, gzip, x-gzip responses by default
https://bugs.ruby-lang.org/issues/6492#change-26895

Author: drbrain (Eric Hodel)
Status: Open
Priority: Normal
Assignee: 
Category: lib
Target version: 2.0.0


=begin
This patch moves the compression-handling code from Net::HTTP#get to Net::HTTPResponse to allow decompression to occur by default on any response body.  (A future patch will set the Accept-Encoding on all requests that allow response bodies by default.)

Instead of having separate decompression code for deflate and gzip-encoded responses, (({Zlib::Inflate.new(32 + Zlib::MAX_WBITS)})) is used, which automatically detects and inflates gzip-wrapped streams.  This allows for simpler processing of gzip bodies (no need to create a StringIO).
=end
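
A short sketch of the header auto-detection described above.  The helper name inflate_auto is hypothetical; the same inflate setup handles both a gzip-wrapped and a zlib-wrapped stream:

```ruby
require 'zlib'

# Hypothetical helper: inflates data whose wrapper (gzip or zlib) is
# detected from the stream header.
def inflate_auto(data)
  inflate = Zlib::Inflate.new(32 + Zlib::MAX_WBITS)
  inflate.inflate(data)
ensure
  inflate.close if inflate
end

body = "hello world " * 10

gzipped  = Zlib.gzip(body)             # gzip wrapper
deflated = Zlib::Deflate.deflate(body) # zlib wrapper

from_gzip    = inflate_auto(gzipped)
from_deflate = inflate_auto(deflated)
```

Raw deflate data with no zlib or gzip header is a separate case and is not covered by this sketch.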



-- 
http://bugs.ruby-lang.org/