Git operations become slow after upgrading to 15.7
Summary
Since we upgraded from 15.6 to 15.7, git operations become slow and eventually time out.
Steps to reproduce
- Upgrade from 15.6 to 15.7 (in our case, from 15.6.6 to 15.7.7 in our self-hosted environment).
- Observe slow git push and clone after some time (possibly hours).
- Upgrade to 15.8 and still see the problem.
- Roll back to 15.6 and the problem goes away.
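Because the slowdown only shows up hours after the upgrade, it may help to record git latency at fixed intervals. The following is a minimal sketch, not something we ran as part of this report: it times `git ls-remote` against a repository URL and prints one sample per interval. The function names, the 5-minute interval, and the 120-second timeout are illustrative assumptions, and it assumes git credentials are already configured so the command does not prompt.

```python
import subprocess
import time


def time_command(cmd, timeout_s=120.0):
    """Run cmd and return (elapsed_seconds, succeeded).

    A timeout or a non-zero exit code counts as a failure.
    """
    start = time.monotonic()
    try:
        result = subprocess.run(
            cmd,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
        ok = result.returncode == 0
    except subprocess.TimeoutExpired:
        ok = False
    return time.monotonic() - start, ok


def probe(repo_url, interval_s=300.0, samples=12):
    """Time `git ls-remote` against repo_url once per interval (hypothetical helper)."""
    results = []
    for i in range(samples):
        elapsed, ok = time_command(["git", "ls-remote", repo_url])
        print(f"{time.strftime('%H:%M:%S')} ls-remote {elapsed:.2f}s ok={ok}")
        results.append((elapsed, ok))
        if i < samples - 1:
            time.sleep(interval_s)
    return results
```

Calling `probe("<repository URL>")` right after the upgrade and leaving it running should show roughly when response times start to climb.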
Example Project
What is the current bug behavior?
Strangely, the slowness does not appear immediately after upgrading to 15.7. It took a few hours (well after all background migration jobs had completed) before git operations became slow. Resource consumption looked fine throughout, but git operation response times then increased sharply until the operations timed out.
What is the expected correct behavior?
Git operations should complete in a normal amount of time.
Relevant logs and/or screenshots
At first we thought it was a memory leak, but memory consumption of the pods looked OK.
In the logs, it looks suspicious that a worker shuts down as soon as it starts:
15:16:49 gitlab-webservice-default-5b548f49b9-48qzd webservice unknown - Worker 18 (PID: 216) booted in 0.13s, phase: 0
15:16:49 gitlab-webservice-default-5b548f49b9-48qzd webservice unknown === puma shutdown: 20:16:49 +0000 ===
15:16:49 gitlab-webservice-default-5b548f49b9-48qzd webservice unknown - Goodbye!
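To check how often this pattern occurs, the logs can be scanned for a worker boot followed shortly by a puma shutdown in the same pod. The sketch below assumes every line has the same prefix as the excerpt above (time, pod name, container name, one more field, then the message); the function names and the 5-second window are illustrative choices, and midnight rollover is ignored.

```python
import re

# Assumed prefix, based on the excerpt: "HH:MM:SS <pod> <container> <field> <message>"
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}) (\S+) (\S+) \S+ (.*)$")
BOOT_RE = re.compile(r"Worker (\d+) \(PID: (\d+)\) booted")


def to_seconds(hms):
    """Convert 'HH:MM:SS' to seconds since midnight."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s


def boots_before_shutdown(lines, window_s=5):
    """Return (pod, worker, pid, gap_seconds) for each worker boot that is
    followed by a puma shutdown in the same pod within window_s seconds."""
    recent_boots = {}  # pod name -> list of (boot_time, worker, pid)
    hits = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        ts, pod, _container, msg = match.groups()
        t = to_seconds(ts)
        boot = BOOT_RE.search(msg)
        if boot:
            recent_boots.setdefault(pod, []).append(
                (t, int(boot.group(1)), int(boot.group(2)))
            )
        elif "puma shutdown" in msg:
            # A shutdown closes out all boots recorded so far for this pod.
            for boot_time, worker, pid in recent_boots.pop(pod, []):
                if 0 <= t - boot_time <= window_s:
                    hits.append((pod, worker, pid, t - boot_time))
    return hits


sample = [
    "15:16:49 gitlab-webservice-default-5b548f49b9-48qzd webservice unknown - Worker 18 (PID: 216) booted in 0.13s, phase: 0",
    "15:16:49 gitlab-webservice-default-5b548f49b9-48qzd webservice unknown === puma shutdown: 20:16:49 +0000 ===",
    "15:16:49 gitlab-webservice-default-5b548f49b9-48qzd webservice unknown - Goodbye!",
]
print(boots_before_shutdown(sample))
# [('gitlab-webservice-default-5b548f49b9-48qzd', 18, 216, 0)]
```

Counting the hits per pod over several hours would show whether these restarts line up with the onset of the slowness.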
The change in !108112 (merged) also caught our attention. With it enabled by default, we observed "memory limit exceeded" messages in the logs even before any slowness occurred. We tried disabling it via its ops feature flag. The error message went away, but the slowness remained.
In our self-hosted k8s environment, we have enough memory allocated for the webservice deployment, and we have tried increasing the number of pods while reducing the number of puma workers per pod.
Output of checks
Results of GitLab environment info
Self-hosted GitLab v15.7.7 and v15.8.3 on k8s clusters
Results of GitLab application Check
(For installations with the omnibus-gitlab package, run and paste the output of:
sudo gitlab-rake gitlab:check SANITIZE=true
)
(For installations from source, run and paste the output of:
sudo -u git -H bundle exec rake gitlab:check RAILS_ENV=production SANITIZE=true
)
(We will only investigate if the tests are passing.)