Handle redirects on artifact uploads
What does this MR do?
Fixes artifacts uploading when 307 response is sent back by the server.
Why was this MR needed?
We've found this problem while working on https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/14874.
The problem is caused by this part of Go's http.Client
implementation:
case 307, 308:
redirectMethod = reqMethod
shouldRedirect = true
includeBody = true
// Treat 307 and 308 specially, since they're new in
// Go 1.8, and they also require re-sending the request body.
if resp.Header.Get("Location") == "" {
// 308s have been observed in the wild being served
// without Location headers. Since Go 1.7 and earlier
// didn't follow these codes, just stop here instead
// of returning an error.
// See Issue 17773.
shouldRedirect = false
break
}
if ireq.GetBody == nil && ireq.outgoingLength() != 0 {
// We had a request body, and 307/308 require
// re-sending it, but GetBody is not defined. So just
// return this response to the user instead of an
// error, like we did in Go 1.7 and earlier.
shouldRedirect = false
}
Initially there were three "redirecting" responses defined by RFC 2616: 301
, 302
and 303
. And in that RFC it was defined that automatic redirection should be allowed for GET
and HEAD
requests.
This was changed later by RFC 7231, but not all clients have implemented the change (including Go). When 307
and 308
were introduced, the specification required that body and method used by the repeated requests are not changed.
As we can see here, redirection is disabled (and the 307
/308
response is returned directly to the caller) if:
-
Location
header is not set (which is clear, as if there is no location provided, client can't know where to send the repeated request), -
GetBody
function is empty andoutgoingLength()
returns more than0
.
The second case is what causes problem in the Runner. As we're constructing the request with http.NewRequestWithContext()
, Body
(the statical value of the request struct) is set explicitly. GetBody()
function is defined only in some cases:
if body != nil {
switch v := body.(type) {
case *bytes.Buffer:
req.ContentLength = int64(v.Len())
buf := v.Bytes()
req.GetBody = func() (io.ReadCloser, error) {
r := bytes.NewReader(buf)
return io.NopCloser(r), nil
}
case *bytes.Reader:
req.ContentLength = int64(v.Len())
snapshot := *v
req.GetBody = func() (io.ReadCloser, error) {
r := snapshot
return io.NopCloser(&r), nil
}
case *strings.Reader:
req.ContentLength = int64(v.Len())
snapshot := *v
req.GetBody = func() (io.ReadCloser, error) {
r := snapshot
return io.NopCloser(&r), nil
}
default:
// This is where we'd set it to -1 (at least
// if body != NoBody) to mean unknown, but
// that broke people during the Go 1.8 testing
// period. People depend on it being 0 I
// guess. Maybe retry later. See Issue 18117.
}
// For client requests, Request.ContentLength of 0
// means either actually 0, or unknown. The only way
// to explicitly say that the ContentLength is zero is
// to set the Body to nil. But turns out too much code
// depends on NewRequest returning a non-nil Body,
// so we use a well-known ReadCloser variable instead
// and have the http package also treat that sentinel
// variable to mean explicitly zero.
if req.GetBody != nil && req.ContentLength == 0 {
req.Body = NoBody
req.GetBody = func() (io.ReadCloser, error) { return NoBody, nil }
}
}
In our case - where for artifacts body
value contains *io.PipeReader
- it's not created. And then when artifacts uploading request receives 307
- it's not redirected but returned directly to the uploading command context. As 307
is not treated in gitlab.go
as successful (as expected) - the operation fails. Finally it ends like here:
Uploading artifacts...
date: found 1 matching files and directories
WARNING: Uploading artifacts as "archive" to coordinator... 307 Temporary Redirect id=40276855 responseStatus=307 Temporary Redirect status=307 token=uWL4ywrj
WARNING: Retrying... context=artifacts-uploader error=invalid argument
WARNING: Uploading artifacts as "archive" to coordinator... 307 Temporary Redirect id=40276855 responseStatus=307 Temporary Redirect status=307 token=uWL4ywrj
WARNING: Retrying... context=artifacts-uploader error=invalid argument
WARNING: Uploading artifacts as "archive" to coordinator... 307 Temporary Redirect id=40276855 responseStatus=307 Temporary Redirect status=307 token=uWL4ywrj
FATAL: invalid argument
The initial idea to fix that was to make the http.Client
internal handler requirements fulfilled. But as mentioned by @ajwalker in the discussion bellow, this could cause problems - especially with the artifacts uploading - if the receiver would consume the body while responding with 307/308. The HTTP specification doesn't define whether it's allowed or not so we can't assume that all proxies will ignore the body.
With that in mind, the proposed approach is to handle the redirection on the artifact uploader command. We already have there a mechanism for redirects (which also restarts archiving the files to a pipe stream) so it's just a matter of returning a new status and a new location that should be used.
What's the best way to test this MR?
Integration test is added to the MR. Other existing tests should ensure that this change doesn't break anything.