CI/CD Readiness Review Addendum Runbook Responses

In preparation for taking over day-to-day handling of CI/CD issues, the runbooks should cover the alerts that have fired frequently:

  • CPU use percent is extremely high on shared-runners-manager-4.gitlab.com for the past 2 hours.
  • No disk space left on /opt/prometheus/prometheus/data on prometheus-01.nyc1.do.gitlab-runners.gitlab.net: 641.8m%
  • No disk space left on /opt/gitlab on runners-cache-5.gitlab.com: 997.3m%
  • Runners manager is down on shared-runners-manager-4.gitlab.com:9402
  • CICDTooManyPendingJobsPerNamespace
  • CICDTooManyRunningJobsPerNamespaceOnSharedRunnersGitLabOrg
  • CICDNamespaceWithConstantNumberOfLongRunningRepeatedJobs
  • CICDJobQueueDurationUnderperformant
  • CICDTooManyPendingBuildsOnSharedRunnerProject
  • CICDTooManyArchivingTraceFailures
Edited Mar 21, 2019 by Alex Hanselka