Draft: Support passing empty and binary HTTP header values

Relates to https://gitlab.com/gitlab-com/ops-sub-department/section-ops-request-for-help/-/issues/451.

We don't validate the headers that agent returns because stdlib's HTTP server:

Validation using protovalidate would be nice, but we cannot do it yet:

Relevant info from the issue is below.


Looking at the support case:

 "message": "GitLab Agent Server: HTTP-\u003egRPC: failed to read gRPC response: rpc error: code = InvalidArgument desc = invalid HttpRequest.Header: embedded message failed validation | caused by: invalid HttpRequest_Header.Request: embedded message failed validation | caused by: invalid HttpRequest.Header[Wl-Proxy-Client-Cert]: embedded message failed validation | caused by: invalid HeaderValues.Value[0]: value length must be at least 1 bytes",
 "reason": "InternalError",
 "code": 502

This is kas producing an error while reading response from agentk for a proxied request to the Kubernetes API. The error is produced by agentk while validating what kas sent to it. And kas somehow sent an empty value for a header. It turns out empty values for headers are allowed. This is a bug that we need to fix.

func TestName(t *testing.T) {
	x := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		t.Log(r.Header)
		w.Header().Set("abc", "")      // empty string
		w.Header()["Xyz"] = []string{} // empty slice
	}))
	defer x.Close()

	c := http.Client{}

	r, err := http.NewRequest(http.MethodGet, x.URL, nil)
	require.NoError(t, err)

	r.Header.Set("Empty", "") // empty string
	r.Header["No-Val"] = []string{}

	resp, err := c.Do(r)
	require.NoError(t, err)

	t.Log(resp.Header)
}

prints

=== RUN   TestName
    defaulting_test.go:32: map[Accept-Encoding:[gzip] Empty:[] User-Agent:[Go-http-client/1.1]]
    defaulting_test.go:49: map[Abc:[] Content-Length:[0] Date:[Sat, 12 Oct 2024 06:38:51 GMT]]
--- PASS: TestName (0.00s)
PASS

The header must be added by the reverse proxy that they use in front of kas. Looks like Wl-Proxy-Client-Cert belongs to WebLogic.


error while marshaling: string field contains invalid UTF-8

Turns out HTTP/1.1 allows some binary data in header values:

  field-value    = *field-content
  field-content  = field-vchar
                   [ 1*( SP / HTAB / field-vchar ) field-vchar ]
  field-vchar    = VCHAR / obs-text
  obs-text       = %x80-FF

Where VCHAR is:

VCHAR          =  %x21-7E
               ; visible (printing) characters

Header name is just a token, which is defined here:

token          = 1*tchar

tchar          = "!" / "#" / "$" / "%" / "&" / "'" / "*"
                 / "+" / "-" / "." / "^" / "_" / "`" / "|" / "~"
                 / DIGIT / ALPHA
                 ; any VCHAR, except delimiters

DIGIT          =  %x30-39
                  ; 0-9
ALPHA          =  %x41-5A / %x61-7A   ; A-Z / a-z

Interestingly, stdlib uses different approaches for HTTP/1.1 and for HTTP/2:

First two are used initially when request is read using textproto.Reader.ReadMIMEHeader(). But then the caller uses the second two to validate what was read.


I think our validation must be loosened to match stdlib and RFCs.

Edited by Mikhail Mazurskiy

Merge request reports

Loading