Issue #6492 has been updated by naruse (Yui NARUSE).


drbrain (Eric Hodel) wrote:
> naruse (Yui NARUSE) wrote:
> > drbrain (Eric Hodel) wrote:
> > > Due to read_chunked, and persistent connections I don't see how to make this work.
> > 
> > Yeah, I thought adding an another layer, transport encoding decoder, but it is just an idea and I don't suggest it.
> 
> I had this idea too, but it would be a larger change. I hope we can create something simpler, but also usable.
> 
> > > > This read method return a string whose length is not clen, this is wrong.
> > > > Other IO-like object for example Zlib::GzipReader returns a string whose length is clen.
> > > > So Inflater should have a internal buffer and return the string whose length is just clen.
> > > 
> > > Upon review, I think this is OK.
> > > 
> > > RFC 2616 specifies that Content-Length and Content-Range (which are used for clen) refer to the transferred bytes and are used to read the correct amount of data from the response to maintain the persistent connection.  Net::HTTPResponse#read_body doesn't allow the user to specify the amount of bytes they wish to read, so returning more data to the user is OK.
> > 
> > Your patch hides content-encoding layer.
> > Content-Length and Content-Range are the member of the layer.
> > 
> > Net:HTTPRequest#read is on the layer.
> 
> I'm confused.
> (snip)
> Since Net::HTTPResponse is not usable as an IO I don't think IO-like behavior should apply to Net::HTTPResponse::Inflater.

Ah, yes, you are correct, it is only for Net::HTTPResponse::Inflater.

> > A user of net/http can't know whether a request used content-encoding or not.
> 
> I am unsure what you mean by "can't".
> 
> Do you mean "a user of net/http must be able to tell content-encoding was present"?

Yes.

> When this patch is combined with #6494 they will not be able to know whether a request used content-encoding or not.  I think this is good, the user should not have to worry about how the bytes were transported.  (This behavior matches Net::HTTP#get).

Yeah, I think it is acceptable.

> If we want the user to be able to handle content-encoding themselves I think adding a Net::HTTP#compression = false (which will disable both #6492 and #6494) would be best.  We can also add an indicator on Net::HTTPResponse that decompression was performed.

Someone may want such function, but it is not required until someone actually request.

> For Content-Length with Content-Encoding, the Content-Length will be invalid.  I think this is OK because RFC 2616 doesn't contain an indicator of the decoded length and the user is most likely interested in the decoded body.

It is OK for an HTTP client itself because Content-Length is for the reader of socket.
A client can know how many bytes should it read by Content-Length.
For this purpose, Content-Length must show the compressed size.

> Content-Range with Content-Encoding requires special handling.  The compressed stream may start anywhere in the underlying block.  (For a deflate-based stream the user would need to manually reconstruct the complete response in order to inflate it.)  I think such users should disable compression.

So on range response with content-encoding, the response won't uncompress the body
and keep Content-Encoding field, right?
If so, I agree.

> > On such situation, it can't be a reason why hidden Content-Encoding layer effects the behavior of read method.
> 
> I agree that in RFC 2616 that Content-Encoding, Content-Length and Content-Range are all on the same layer, but without an IO-like interface for the Net::HTTPResponse body I don't think a restriction on the behavior of the read method should apply.  Since this API is entirely internal, I think it is OK if a future addition to the API needs to add buffering to be IO-like.

OK, I agree if the read method has a comment which explain it breaks the manner of IO-like object
because it is internal API.
----------------------------------------
Feature #6492: Inflate all HTTP Content-Encoding: deflate, gzip, x-gzip responses by default
https://bugs.ruby-lang.org/issues/6492#change-27101

Author: drbrain (Eric Hodel)
Status: Assigned
Priority: Normal
Assignee: naruse (Yui NARUSE)
Category: lib
Target version: 2.0.0


=begin
This patch moves the compression-handling code from Net::HTTP#get to Net::HTTPResponse to allow decompression to occur by default on any response body.  (A future patch will set the Accept-Encoding on all requests that allow response bodies by default.)

Instead of having separate decompression code for deflate and gzip-encoded responses, (({Zlib::Inflate.new(32 + Zlib::MAX_WBITS)})) is used which automatically detects and inflated gzip-wrapped streams which allows for simpler processing of gzip bodies (no need to create a StringIO).
=end



-- 
http://bugs.ruby-lang.org/