Revert "Merge branch 'wc-gitaly-keepalive-limit' into 'master'"
What does this MR do and why?
This reverts !73302 (merged). The keepalive setting it introduced causes long-running RPCs to shut down and causes the client to receive GOAWAY messages from the server. As explained in https://github.com/grpc/grpc/issues/25713, configuring Gitaly keepalive settings is dangerous: if the client sends too many keepalive pings, the server will abruptly close the connection, causing RPCs to fail.
At the moment, it doesn't appear grpc-go has a way to configure or even disable this shutdown mechanism (I submitted https://github.com/grpc/grpc-go/pull/5162). For now, we should revert this change because it does more harm than good.
Go 1.13 should have enabled 15-second TCP keepalives by default; I'm not sure yet why this isn't working, or whether gRPC is fiddling with this as well.
Relates to #350580 (closed)
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.
Activity
changed milestone to %14.8
added Pick into 14.6 Pick into 14.7 labels
assigned to @wchandler
mentioned in issue #350580 (closed)
added backend label
assigned to @stanhu and unassigned @wchandler
requested review from @wchandler
added Pick into 14.5 label
- Resolved by Stan Hu
Thanks, @stanhu, nice catch.
Once we revert this we will see failures on long-running RPCs in Gitaly Cluster again, as in https://gitlab.com/gitlab-org/quality/gitlab-environment-toolkit/-/issues/290. This can be worked around with HAProxy by setting a 6-hour timeout, but ELBs have a hard limit of 350 seconds.
What about having the server send the keepalive instead? The default interval is 2 hours; we could bump that down to 5 minutes:
diff --git a/internal/gitaly/server/server.go b/internal/gitaly/server/server.go
index 06a665dcf..b808aa76c 100644
--- a/internal/gitaly/server/server.go
+++ b/internal/gitaly/server/server.go
@@ -147,6 +147,9 @@ func New(
 			MinTime:             20 * time.Second,
 			PermitWithoutStream: true,
 		}),
+		grpc.KeepaliveParams(keepalive.ServerParameters{
+			Time: 5 * time.Minute,
+		}),
 	}
 
 	return grpc.NewServer(opts...), nil
I haven't tested this yet, but does the idea seem sound?
removed review request for @wchandler
@wchandler, thanks for approving this merge request. This is the first time the merge request is approved. To ensure full test coverage, a new pipeline has been started.
requested review from @toon
removed review request for @toon
mentioned in issue #349425 (closed)
mentioned in issue gitaly#4007 (closed)
mentioned in commit 0693fbff
added workflow::staging-canary label
added workflow::staging label and removed workflow::staging-canary label
added workflow::canary label and removed workflow::staging label
added workflow::production label and removed workflow::canary label
mentioned in merge request gitaly!4278 (merged)
removed Pick into 14.5 Pick into 14.6 labels
Marking this severity::1 since large imports and other long-running RPCs no longer work (relates to https://gitlab.com/gitlab-org/gitlab/-/issues/351340).
mentioned in issue gitlab-org/release/tasks#3423 (closed)
picked the changes into the branch 14-7-stable-ee-patch-2 with commit 24f4aecf
Automatically picked into !79966 (merged), will merge into 14-7-stable-ee ready for 14.7.2-ee.
removed Pick into 14.7 label
mentioned in commit 24f4aecf
mentioned in merge request !79966 (merged)
added released::candidate label
added released::published label and removed released::candidate label
mentioned in issue #359120