Issue #16672 has been updated by jmreid (Justin Reid).


ioquatix (Samuel Williams) wrote in #note-2:
> Are you using `res.content_length`? Why can't you just use `res.body.to_s.bytesize`?

`res.content_length` is just the `content-length` response header from the net/http request:

```
uri = URI('https://storage.googleapis.com/justin-reid-test/test.js')
res = Net::HTTP.get_response(uri)

res.to_hash
```
```
{"x-guploader-uploadid"=>["AEnB2UrJSkEHDPO5iCRug2Qp0bzweRd8pd05d5eRq0fIUtRVZuibaGfQZhLHBN58g0ZW1N-qMcsUiZACutpDoObHTijYs3_DrUNOy8SH9HA1hhTW0RtIbco"],
 "date"=>["Thu, 05 Mar 2020 01:35:08 GMT"],
 "expires"=>["Fri, 05 Mar 2021 01:35:08 GMT"],
 "last-modified"=>["Wed, 04 Mar 2020 20:53:40 GMT"],
 "etag"=>["\"e5994ce974ae1b8e426810037812e7d5\""],
 "x-goog-generation"=>["1583355220705723"],
 "x-goog-metageneration"=>["2"],
 "x-goog-stored-content-encoding"=>["gzip"],
 "x-goog-stored-content-length"=>["2733"],
 "content-type"=>["application/javascript; charset=utf-8"],
 "x-goog-hash"=>["crc32c=VCx7Dg==", "md5=5ZlM6XSuG45CaBADeBLn1Q=="],
 "x-goog-storage-class"=>["STANDARD"],
 "accept-ranges"=>["bytes"],
 "vary"=>["Accept-Encoding"],
 "content-length"=>["2733"],
 "server"=>["UploadServer"],
 "cache-control"=>["public, max-age=31536000, immutable"],
 "age"=>["6"],
 "alt-svc"=>["quic=\":443\"; ma=2592000; v=\"46,43\",h3-Q050=\":443\"; ma=2592000,h3-Q049=\":443\"; ma=2592000,h3-Q048=\":443\"; ma=2592000,h3-Q046=\":443\"; ma=2592000,h3-Q043=\":443\"; ma=2592000"]}
```

The issue here is that the `"content-length"=>["2733"]` header still reports 2733 bytes, while `res.body` at this point is 9995 bytes. `content-length` needs to either match the body size or not exist at all. Otherwise it causes browsers to only partially download the file.

----------------------------------------
Bug #16672: net/http leaves original content-length header intact after inflating response
https://bugs.ruby-lang.org/issues/16672#change-84490

* Author: jmreid (Justin Reid)
* Status: Open
* Priority: Normal
* ruby -v: ruby 2.6.5p114 (2019-10-01 revision 67812) [x86_64-darwin19]
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN
----------------------------------------
When using net/http to make a request to a resource, the default request headers are the following (when you have ZLIB available):
`"accept-encoding"=>["gzip;q=1.0,deflate;q=0.6,identity;q=0.3"], "accept"=>["*/*"], "user-agent"=>["Ruby"]`

This means that a resource will return a gzipped response if it can provide it. Take this URL for example:
`https://storage.googleapis.com/justin-reid-test/test.js`

This is a JS file that has a `content-length` of `2733` when gzipped and `9995` when inflated:

```
curl "https://storage.googleapis.com/justin-reid-test/test.js" -H "accept-encoding: gzip;q=1.0,deflate;q=0.6,identity;q=0.3" | wc -c
2733

curl "https://storage.googleapis.com/justin-reid-test/test.js" | wc -c
9995
```


When making a simple request for this asset using net/http:
```
uri = URI('https://storage.googleapis.com/justin-reid-test/test.js')
res = Net::HTTP.get_response(uri)
```

Ruby will (https://github.com/ruby/ruby/blob/f08cd708b11dd5b293986b92bb5e227731665b36/lib/net/http/response.rb#L264-L278):
- Delete the `content-encoding` header
- Inflate the body
- Return the inflated body

The issue here is that Ruby also leaves the `content-length` header set to the value from the original (gzipped) response:
```
require 'net/http'

uri = URI('https://storage.googleapis.com/justin-reid-test/test.js')
res = Net::HTTP.get_response(uri)

puts "Fetching: https://storage.googleapis.com/justin-reid-test/test.js"
puts "Body size using String#bytesize: #{res.body.to_s.bytesize}"
puts "Content-Length response header: #{res.content_length}"
```

Results in:
```
Fetching: https://storage.googleapis.com/justin-reid-test/test.js
Body size using String#bytesize: 9995
Content-Length response header: 2733
```

This means that an incorrect `content-length` header is passed back whenever net/http requests a gzipped object and inflates it.
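
The mismatch can also be reproduced offline. The following is only a rough sketch that imitates the three steps above with `Zlib` and a plain Hash; it does not call net/http. The `simulate_inflate` helper and the stand-in payload are assumptions made up for this illustration:

```
require 'zlib'

# Hypothetical helper: imitates what happens to the headers and body when a
# gzipped response is inflated but content-length is left untouched.
def simulate_inflate(payload)
  compressed = Zlib.gzip(payload)

  # Headers as the server would send them for the gzipped body.
  headers = {
    'content-encoding' => 'gzip',
    'content-length'   => compressed.bytesize.to_s
  }

  # Drop content-encoding and inflate the body; content-length is not updated.
  headers.delete('content-encoding')
  body = Zlib.gunzip(compressed)

  [headers, body]
end

# Stand-in payload; the real test.js is not fetched here.
payload = "console.log('hello');\n" * 400
headers, body = simulate_inflate(payload)

puts "Body size using String#bytesize: #{body.bytesize}"
puts "Content-Length header: #{headers['content-length']}"
puts "Header is stale: #{headers['content-length'].to_i != body.bytesize}"
```

The header describes the compressed bytes, so it is smaller than the inflated body for any payload that actually compresses.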


This issue was noticed when Rack changed their behaviour in how they compute content-length. They used to compute the content-length for each body, but that changed in 2.0.8:
https://github.com/rack/rack/commit/8c62821f4a464858a6b6ca3c3966ec308d2bb53e#diff-10b933d2c1fdc82ceecade456c64e1c2L92
https://github.com/rack/rack/issues/1472#issuecomment-574362342

Using `Rack::ContentLength` is now the method they prefer if you need to compute the content-length. However, `Rack::ContentLength` will not try to re-compute the value if that header already exists:
https://github.com/rack/rack/blob/6196377654b7ff7ce7abaecea62bb285d77d53aa/lib/rack/content_length.rb#L21
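
To illustrate why the stale value survives, here is a simplified stand-in for that "only set it when missing" guard. This is not Rack's code; `add_content_length_if_missing` is a hypothetical helper working on a plain Hash, and it models only the header-presence check:

```
# Hypothetical stand-in: sets content-length only when the header is absent.
def add_content_length_if_missing(headers, body)
  return headers if headers.key?('content-length')

  headers.merge('content-length' => body.bytesize.to_s)
end

inflated_body = 'x' * 9995

# A stale header from the gzipped response is left alone.
stale = add_content_length_if_missing({ 'content-length' => '2733' }, inflated_body)
puts stale['content-length']   # prints 2733

# With no header present, the value is computed from the body.
fresh = add_content_length_if_missing({}, inflated_body)
puts fresh['content-length']   # prints 9995
```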

Should Ruby:
- Do a `self.delete 'content-length'` in the inflater?
- Compute the `content-length` itself and update the header? (Hacky example: https://github.com/ruby/ruby/compare/master...jmreid:content-length)
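
As a rough sketch of what each option would mean for the headers, the two helpers below operate on a plain Hash rather than on net/http internals. Both names are hypothetical, and this is not the patch linked above:

```
# Option 1 (hypothetical helper): remove content-length entirely.
# Header names are compared case-insensitively.
def drop_content_length(headers)
  headers.reject { |name, _value| name.casecmp?('content-length') }
end

# Option 2 (hypothetical helper): replace content-length with the size of the
# inflated body.
def recompute_content_length(headers, body)
  drop_content_length(headers).merge('content-length' => body.bytesize.to_s)
end

headers = { 'content-length' => '2733', 'content-type' => 'application/javascript' }
body    = 'x' * 9995

p drop_content_length(headers)
p recompute_content_length(headers, body)
```

Either way, a consumer that trusts the header no longer sees the compressed size. The first option leaves it to something like `Rack::ContentLength` to fill in the value later, while the second makes the response self-consistent immediately.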




