CI/CD Readiness Review Addendum Runbook Responses
In preparation for taking over day-to-day handling of CI/CD issues, the runbooks should cover the alerts that have occurred frequently:
- CPU use percent is extremely high on shared-runners-manager-4.gitlab.com for the past 2 hours.
- No disk space left on /opt/prometheus/prometheus/data on prometheus-01.nyc1.do.gitlab-runners.gitlab.net: 641.8m%
- No disk space left on /opt/gitlab on runners-cache-5.gitlab.com: 997.3m%
- Runners manager is down on shared-runners-manager-4.gitlab.com:9402
- CICDTooManyPendingJobsPerNamespace
- CICDTooManyRunningJobsPerNamespaceOnSharedRunnersGitLabOrg
- CICDNamespaceWithConstantNumberOfLongRunningRepeatedJobs
- CICDJobQueueDurationUnderperformant
- CICDTooManyPendingBuildsOnSharedRunnerProject
- CICDTooManyArchivingTraceFailures
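
The disk-space alerts above report values such as `641.8m%`, which can be confusing when triaging. A minimal sketch of how to read them, assuming the `m` is an SI milli prefix added by a Prometheus-style humanize step in the alert template (the source does not say how the values are formatted). The helper name `parse_humanized_percent` and the prefix table are hypothetical, for illustration only:

```python
# Assumption: a trailing letter before "%" is an SI prefix (e.g. "m" = milli),
# as a Prometheus-style humanize step would produce. Not confirmed by the source.
SI_PREFIXES = {"u": 1e-6, "m": 1e-3, "k": 1e3, "M": 1e6, "G": 1e9}


def parse_humanized_percent(text: str) -> float:
    """Convert a humanized percentage such as '641.8m%' to a plain percent value."""
    body = text.strip().rstrip("%").strip()
    if body and body[-1] in SI_PREFIXES:
        # Split off the prefix letter and scale the numeric part.
        return float(body[:-1]) * SI_PREFIXES[body[-1]]
    # No prefix: the value is already a plain percentage.
    return float(body)


if __name__ == "__main__":
    for raw in ("641.8m%", "997.3m%"):
        print(f"{raw} -> {parse_humanized_percent(raw):.4f}%")
```

Under that assumption, both alerts describe a value of less than 1%, not several hundred percent.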
Edited by Alex Hanselka