Issue #16672 has been updated by jmreid (Justin Reid).


jeremyevans0 (Jeremy Evans) wrote in #note-8:
> `total_out` doesn't give you the full size of the output until after the input is fully processed.  So if there is an exception or other early exit from the block passed to `inflater` before the body is fully inflated, you can end up with an incorrect result.  Also, the `Content-Length` header inside the block would still be wrong.  You could remove it before the block and only set it on success after the block, that's probably the best way to handle it if you want to modify it.

Ah, understood!
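
A minimal sketch of that suggestion, written outside of net/http. The helper name `inflate_with_length` and the plain `Hash` of headers are hypothetical stand-ins, not the actual inflater code. It removes `content-length` before yielding and sets it from `total_out` only after the whole body has been inflated:

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper (not part of net/http): inflates `compressed` in chunks,
# yields each inflated chunk, and sets content-length only on success.
def inflate_with_length(headers, compressed, chunk_size: 1024)
  headers.delete('content-encoding')
  headers.delete('content-length')                    # remove before the block runs
  inflater = Zlib::Inflate.new(32 + Zlib::MAX_WBITS)  # auto-detect gzip or zlib
  io = StringIO.new(compressed)
  begin
    while (chunk = io.read(chunk_size))
      yield inflater.inflate(chunk)
    end
    yield inflater.finish
    # total_out is only the full size once all input has been processed
    headers['content-length'] = inflater.total_out.to_s
  ensure
    inflater.close
  end
  headers
end
```

If the block raises partway through, the header stays absent rather than wrong.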

----------------------------------------
Bug #16672: net/http leaves original content-length header intact after inflating response
https://bugs.ruby-lang.org/issues/16672#change-84496

* Author: jmreid (Justin Reid)
* Status: Open
* Priority: Normal
* ruby -v: ruby 2.6.5p114 (2019-10-01 revision 67812) [x86_64-darwin19]
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN
----------------------------------------
When using net/http to request a resource, the default request headers (when Zlib is available) are:
`"accept-encoding"=>["gzip;q=1.0,deflate;q=0.6,identity;q=0.3"], "accept"=>["*/*"], "user-agent"=>["Ruby"]`

This means that a server will return a gzipped response if it can provide one. Take this URL for example:
`https://storage.googleapis.com/justin-reid-test/test.js`

This is a JS file that has a `content-length` of `2733` when gzipped and `9995` when inflated:

```
curl "https://storage.googleapis.com/justin-reid-test/test.js" -H "accept-encoding: gzip;q=1.0,deflate;q=0.6,identity;q=0.3" | wc -c
2733

curl "https://storage.googleapis.com/justin-reid-test/test.js" | wc -c
9995
```


When making a simple request for this asset using net/http:
```
require 'net/http'
uri = URI('https://storage.googleapis.com/justin-reid-test/test.js')
res = Net::HTTP.get_response(uri)
```

Ruby will (https://github.com/ruby/ruby/blob/f08cd708b11dd5b293986b92bb5e227731665b36/lib/net/http/response.rb#L264-L278):
- Delete the `content-encoding` header
- Inflate the body
- Return the inflated body

The issue here is that Ruby also leaves the `content-length` header set to the value from the original (compressed) response:
```
require 'net/http'

uri = URI('https://storage.googleapis.com/justin-reid-test/test.js')
res = Net::HTTP.get_response(uri)

puts "Fetching: https://storage.googleapis.com/justin-reid-test/test.js"
puts "Body size using String#bytesize: #{res.body.to_s.bytesize}"
puts "Content-Length response header: #{res.content_length}"
```

Results in:
```
Fetching: https://storage.googleapis.com/justin-reid-test/test.js
Body size using String#bytesize: 9995
Content-Length response header: 2733
```

This means that net/http passes back an incorrect `content-length` header whenever it requests a gzipped resource and inflates it.
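
The mismatch can also be illustrated without the network. The following is a rough simulation, assuming a plain `Hash` stands in for the response headers; `simulate_inflate` and the payload are made up for illustration and only mirror the three steps listed above:

```ruby
require 'zlib'

# Hypothetical stand-in for the inflate path: takes a Hash of response headers
# and a gzipped body, and mirrors the three steps described earlier.
def simulate_inflate(headers, gzipped_body)
  headers = headers.dup
  headers.delete('content-encoding')   # 1. delete the content-encoding header
  body = Zlib.gunzip(gzipped_body)     # 2. inflate the body
  [headers, body]                      # 3. return it; content-length is untouched
end

original = 'console.log("hello");' * 100   # made-up payload
gzipped  = Zlib.gzip(original)
headers  = { 'content-encoding' => 'gzip',
             'content-length'   => gzipped.bytesize.to_s }

new_headers, body = simulate_inflate(headers, gzipped)
puts "Body size using String#bytesize: #{body.bytesize}"
puts "Content-Length header: #{new_headers['content-length']}"
```

The two printed numbers differ, because the header still describes the compressed bytes.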


This issue surfaced when Rack changed how it computes `content-length`. Rack used to compute the `content-length` for each body, but that changed in 2.0.8:
https://github.com/rack/rack/commit/8c62821f4a464858a6b6ca3c3966ec308d2bb53e#diff-10b933d2c1fdc82ceecade456c64e1c2L92
https://github.com/rack/rack/issues/1472#issuecomment-574362342

`Rack::ContentLength` is now the preferred way to compute the `content-length` if you need it. However, `Rack::ContentLength` will not recompute the value if that header already exists:
https://github.com/rack/rack/blob/6196377654b7ff7ce7abaecea62bb285d77d53aa/lib/rack/content_length.rb#L21
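
To show why a stale header survives this kind of check, here is a simplified stand-in for "set the length only when it is absent". This is not Rack's actual code; the method name, the lowercase header key, and the `Array` of body parts are assumptions for illustration:

```ruby
# Simplified, hypothetical stand-in: only computes content-length when the
# header is missing, so an existing (possibly stale) value passes through.
def add_content_length_if_missing(headers, body_parts)
  return headers if headers.key?('content-length')

  headers.merge('content-length' => body_parts.sum(&:bytesize).to_s)
end

inflated_body = ['x' * 9995]

stale = add_content_length_if_missing({ 'content-length' => '2733' }, inflated_body)
fresh = add_content_length_if_missing({}, inflated_body)

puts stale['content-length']   # prints 2733 (stale value kept)
puts fresh['content-length']   # prints 9995 (computed from the body)
```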

Should Ruby:
- Do a `self.delete 'content-length'` in the inflater?
- Compute the `content-length` itself and update the header? (Hacky example: https://github.com/ruby/ruby/compare/master...jmreid:content-length)




