Deployment Blockers - Week: 2025-09-08-2025-09-14

Overview

| Start date | End date | Production deployments blocked for | # of blockers |
|---|---|---|---|
| 2025-09-08 | 2025-09-14 | 37.0 | 43 |

Blocker types overview

| Blocker Type | Time blocked on gstg | Time blocked on gprd |
|---|---|---|
| Change-request | 0.5 | 0.5 |
| RootCause::Malicious-Traffic | 1.0 | 1.0 |
| RootCause::DB-Migration | 7.0 | 7.0 |
| RootCause::Flaky-Test | 12.5 | 13.5 |
| RootCause::Software-Change | 7.0 | 7.0 |
| RootCause::Saturation | 4.0 | 4.0 |
| RootCause::External-Dependency | 0.5 | 0.5 |
| RootCause::Security | 0.5 | 0.5 |
| RootCause::Indeterminate | 3.0 | 3.0 |
| Total | 36.0 | 37.0 |
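
The Total row can be cross-checked by summing the per-type figures. The following is a minimal sketch, assuming the values are transcribed by hand from the table above. The dictionary and function names are illustrative and are not part of the tooling that generates this report.

```python
# (gstg, gprd) figures copied from the "Blocker types overview" table.
# Labels are written in scoped form (RootCause::<name>); every value is a
# multiple of 0.5, so float addition is exact here.
BLOCKED_BY_TYPE = {
    "Change-request": (0.5, 0.5),
    "RootCause::Malicious-Traffic": (1.0, 1.0),
    "RootCause::DB-Migration": (7.0, 7.0),
    "RootCause::Flaky-Test": (12.5, 13.5),
    "RootCause::Software-Change": (7.0, 7.0),
    "RootCause::Saturation": (4.0, 4.0),
    "RootCause::External-Dependency": (0.5, 0.5),
    "RootCause::Security": (0.5, 0.5),
    "RootCause::Indeterminate": (3.0, 3.0),
}


def column_totals(rows):
    """Return the (gstg, gprd) column sums over all blocker types."""
    gstg = sum(g for g, _ in rows.values())
    gprd = sum(p for _, p in rows.values())
    return gstg, gprd


gstg_total, gprd_total = column_totals(BLOCKED_BY_TYPE)
print(f"gstg={gstg_total} gprd={gprd_total}")  # prints "gstg=36.0 gprd=37.0"
```

Both sums agree with the Total row, and the gprd sum matches the 37.0 reported in the Overview section.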

Weekly overview

| Issue | Blocker type | gstg | gprd |
|---|---|---|---|
| 2025-09-12: Smoke tests are failing on staging-... (gitlab-com/gl-infra/production#20531 - closed) | RootCause::Flaky-Test | 7.0 | 8.0 |
| 2025-09-09: rails_request error rate violating ... (gitlab-com/gl-infra/production#20499 - closed) | RootCause::Software-Change | 7.0 | 7.0 |
| 2025-09-10: GitLab 500 errors and slowness (gitlab-com/gl-infra/production#20510 - closed) | RootCause::Saturation | 4.0 | 4.0 |
| 2025-09-11: Production post deployment migratio... (gitlab-com/gl-infra/production#20518 - closed) | RootCause::DB-Migration | 3.0 | 3.0 |
| 2025-09-12: K8s deployment failing on staging-c... (gitlab-com/gl-infra/production#20530 - closed) | RootCause::Indeterminate | 3.0 | 3.0 |
| 2025-09-08: Production Post Deployment Migratio... (gitlab-com/gl-infra/production#20494 - closed) | RootCause::DB-Migration | 2.0 | 2.0 |
| 2025-09-11: staging PDM failed with CheckViolat... (gitlab-com/gl-infra/production#20515 - closed) | RootCause::DB-Migration | 2.0 | 2.0 |
| 2025-09-12: Remove prefix-based routing for git... (gitlab-com/gl-infra/production#20481 - closed) | Change-request | 0.5 | 0.5 |
| https://gitlab.com/gitlab-com/gl-infra/production/-/issues/20493 | RootCause::Malicious-Traffic | 0.5 | 0.5 |
| QA failure: qa/specs/features/browser_ui/3_crea... (#21236 - closed) | RootCause::Flaky-Test | 0.5 | 0.5 |
| QA failure: qa/specs/features/browser_ui/10_sof... (#21249 - closed) | RootCause::Flaky-Test | 0.5 | 0.5 |
| QA failure: qa/specs/features/browser_ui/8_moni... (#21252 - closed) | RootCause::Flaky-Test | 0.5 | 0.5 |
| QA failure: qa/specs/features/browser_ui/3_crea... (#21255 - closed) | RootCause::Flaky-Test | 0.5 | 0.5 |
| QA failure: qa/specs/features/browser_ui/4_veri... (#21260 - closed) | RootCause::Flaky-Test | 0.5 | 0.5 |
| QA failure: qa/specs/features/browser_ui/4_veri... (#21262 - closed) | RootCause::Flaky-Test | 0.5 | 0.5 |
| QA failure: (#21264 - closed) | RootCause::Flaky-Test | 0.5 | 0.5 |
| 2025-09-11: New traffic causing poor canary per... (gitlab-com/gl-infra/production#20516 - closed) | RootCause::Malicious-Traffic | 0.5 | 0.5 |
| QA failure: qa/specs/features/browser_ui/3_crea... (#21268 - closed) | RootCause::Flaky-Test | 0.5 | 0.5 |
| 2025-09-11: PubSub messages queuing in pubsub-r... (gitlab-com/gl-infra/production#20519 - closed) | RootCause::External-Dependency | 0.5 | 0.5 |
| QA failure: qa/specs/features/browser_ui/9_tena... (#21283 - closed) | RootCause::Flaky-Test | 0.5 | 0.5 |
| QA failure: qa/specs/features/browser_ui/8_moni... (#21285 - closed) | RootCause::Flaky-Test | 0.5 | 0.5 |
| https://gitlab.com/gitlab-com/gl-infra/production/-/issues/20525 | RootCause::Security | 0.5 | 0.5 |
| QA failures on gstg-cny (#21303 - closed) | RootCause::Flaky-Test | 0.5 | 0.5 |
| QA failures on gstg-cny (#21243 - closed) | | 0.0 | 0.0 |
| QA failures on gstg-cny (#21245 - closed) | | 0.0 | 0.0 |
| Tuesday 2025-09-09 10:31 UTC - `gitlab-org/gitl... (#21246 - closed) | | 0.0 | 0.0 |
| QA failure: qa/specs/features/browser_ui/10_sof... (#21247 - closed) | | 0.0 | 0.0 |
| QA failure: qa/specs/features/browser_ui/10_sof... (#21248 - closed) | | 0.0 | 0.0 |
| https://gitlab.com/gitlab-org/release/tasks/-/issues/21257 | | 0.0 | 0.0 |
| Wednesday 2025-09-10 22:19 UTC - `gitlab-org/gi... (#21258 - closed) | | 0.0 | 0.0 |
| Wednesday 2025-09-10 23:45 UTC - `gitlab-org/gi... (#21261) | | 0.0 | 0.0 |
| Thursday 2025-09-11 22:08 UTC - `gitlab-org/git... (#21275 - closed) | | 0.0 | 0.0 |
| Thursday 2025-09-11 22:46 UTC - `gitlab-org/git... (#21276 - closed) | | 0.0 | 0.0 |
| https://gitlab.com/gitlab-org/release/tasks/-/issues/21278 | | 0.0 | 0.0 |
| https://gitlab.com/gitlab-org/release/tasks/-/issues/21279 | | 0.0 | 0.0 |
| https://gitlab.com/gitlab-org/release/tasks/-/issues/21284 | | 0.0 | 0.0 |
| QA failures on gstg-cny (#21287 - closed) | | 0.0 | 0.0 |
| QA failures on gstg-cny (#21288 - closed) | | 0.0 | 0.0 |
| QA failures on gstg-cny (#21295 - closed) | | 0.0 | 0.0 |
| QA failures on gstg-cny (#21298 - closed) | | 0.0 | 0.0 |
| QA failures on gstg-cny (#21301 - closed) | | 0.0 | 0.0 |
| QA failures on gstg-cny (#21302 - closed) | | 0.0 | 0.0 |
| https://gitlab.com/gitlab-org/release/tasks/-/issues/21313 | | 0.0 | 0.0 |
| Total | | 36.0 | 37.0 |
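
The per-type figures in the Blocker types overview can be reproduced by grouping the rows of this table by blocker type. The following is a rough sketch under stated assumptions: the rows are transcribed by hand, issue references are shortened for readability, and the names `ROWS` and `totals_by_type` are hypothetical rather than taken from the report generator.

```python
from collections import defaultdict

# (issue reference, blocker type, gstg, gprd) for every row that carries a
# blocker type. Rows without a type all show 0.0 / 0.0 and are left out.
ROWS = [
    ("production#20531", "RootCause::Flaky-Test", 7.0, 8.0),
    ("production#20499", "RootCause::Software-Change", 7.0, 7.0),
    ("production#20510", "RootCause::Saturation", 4.0, 4.0),
    ("production#20518", "RootCause::DB-Migration", 3.0, 3.0),
    ("production#20530", "RootCause::Indeterminate", 3.0, 3.0),
    ("production#20494", "RootCause::DB-Migration", 2.0, 2.0),
    ("production#20515", "RootCause::DB-Migration", 2.0, 2.0),
    ("production#20481", "Change-request", 0.5, 0.5),
    ("production#20493", "RootCause::Malicious-Traffic", 0.5, 0.5),
    ("#21236", "RootCause::Flaky-Test", 0.5, 0.5),
    ("#21249", "RootCause::Flaky-Test", 0.5, 0.5),
    ("#21252", "RootCause::Flaky-Test", 0.5, 0.5),
    ("#21255", "RootCause::Flaky-Test", 0.5, 0.5),
    ("#21260", "RootCause::Flaky-Test", 0.5, 0.5),
    ("#21262", "RootCause::Flaky-Test", 0.5, 0.5),
    ("#21264", "RootCause::Flaky-Test", 0.5, 0.5),
    ("production#20516", "RootCause::Malicious-Traffic", 0.5, 0.5),
    ("#21268", "RootCause::Flaky-Test", 0.5, 0.5),
    ("production#20519", "RootCause::External-Dependency", 0.5, 0.5),
    ("#21283", "RootCause::Flaky-Test", 0.5, 0.5),
    ("#21285", "RootCause::Flaky-Test", 0.5, 0.5),
    ("production#20525", "RootCause::Security", 0.5, 0.5),
    ("#21303", "RootCause::Flaky-Test", 0.5, 0.5),
]


def totals_by_type(rows):
    """Sum the gstg and gprd columns per blocker type."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for _ref, blocker_type, gstg, gprd in rows:
        totals[blocker_type][0] += gstg
        totals[blocker_type][1] += gprd
    return {name: tuple(pair) for name, pair in totals.items()}


by_type = totals_by_type(ROWS)
for name, (gstg, gprd) in sorted(by_type.items()):
    print(f"{name}: gstg={gstg} gprd={gprd}")
```

Grouped this way, RootCause::Flaky-Test accounts for 12.5 on gstg and 13.5 on gprd, and the grand totals come to 36.0 and 37.0, matching the Total row.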

Additional incidents

Below is a list of production incidents created last week.

Issue
2025-09-14: Large Increase in RPS for `web-pages` (gitlab-com/gl-infra/production#20536 - closed)
2025-09-14: e2e Runners Offline Again (gitlab-com/gl-infra/production#20535 - closed)
2025-09-14: Sidekiq queueing SLI apdex SLO viol... (gitlab-com/gl-infra/production#20534 - closed)
2025-09-13: Disk space utilization on gitaly no... (gitlab-com/gl-infra/production#20533 - closed)
2025-09-13: rails_request error rate in ai-assi... (gitlab-com/gl-infra/production#20532 - closed)
2025-09-12: Gitaly CNY issues (gitlab-com/gl-infra/production#20528 - closed)
2025-09-12: Sentry Clickhouse disks unexpectedl... (gitlab-com/gl-infra/production#20524 - closed)
2025-09-11: Disk space utilization in gitaly no... (gitlab-com/gl-infra/production#20520 - closed)
2025-09-11: Sidekiq queueing SLO violation on c... (gitlab-com/gl-infra/production#20514 - closed)
2025-09-11: Alertmanager failing to send alerts... (gitlab-com/gl-infra/production#20513 - closed)
2025-09-10: Error rate SLO violation for anthro... (gitlab-com/gl-infra/production#20511 - closed)
2025-09-10: Inference errors for anthropic mode... (gitlab-com/gl-infra/production#20507 - closed)
2025-09-10: Apdex SLO violation for workhorse w... (gitlab-com/gl-infra/production#20504 - closed)
2025-09-09: Apdex SLO violation for workhorse w... (gitlab-com/gl-infra/production#20502 - closed)
2025-09-09: Slow load times for GitLab.com (gitlab-com/gl-infra/production#20501 - closed)
2025-09-08: Long-running transaction on patroni... (gitlab-com/gl-infra/production#20496 - closed)
2025-09-08: gitaly server is slow when accessin... (gitlab-com/gl-infra/production#20495)
2025-09-08: The rails_request SLI of the web se... (gitlab-com/gl-infra/production#20492 - closed)

Instructions

  • Review the Additional incidents list and add the Deploys-blocked and RootCause labels if required.
  • Retrigger this pipeline job to update this report and metrics with the updated values. Note that this will discard any manual changes.
  • Update the "weekly overview" table of this issue to also include:
    • any blocking CRs
  • Update the Deployments metric review epic.
    • Add a new row to the Overview section: copy the information from the Overview section in this issue and link to this issue in the Breakdown of blockers column.
    • Update the graph: update the data in the spreadsheet, then update the graph on the Deployments metric review epic.

📣 The deployment blockers per week can now be visualized in the Deployment Blockers Dashboard.

Edited by Dat Tang