Follow-up from "add flag to parameterize zip http client timeout"
The following discussion from !710 (merged) should be addressed:
-
@jaime started a discussion: I found a reference in the original MR !333 (diffs) but I can't remember why we set it to 30m.
The client timeout includes dialling the connection all the way to reading the entire response body
We already have some metrics on these connections so we should be able to get some data querying the
gitlab_pages_httprange_requests_duration
histogram. I took a quick look at https://prometheus.io/docs/practices/histograms/ and if I'm reading this prometheus graph correctly, only 10-11% of requests fall on the 10s request duration bucket. Perhaps we could add more values the buckets for this histogram to get a clearer picture but that deserves its own issue.For now, it seems that this timeout is quite high. I would suggest proceeding with this change as it is as we wouldn't change any current behavior.We could then change the value in pre/staging and run some tests. We can also validate in production more easily once this flag is in place. Then we can lower the default when we find a value we're more confident with.