Recent broken master incidents blocking our ability to auto-deploy
Since last Friday (2025-10-17) at 20:21 UTC, auto-deploy (build and deploy) has been repeatedly blocked by broken-master incidents.
Below is a compilation of the master-broken incidents that halted our ability to build and deploy packages, with a timeline from the release managers' (RMs') point of view.
Friday (2025-10-17)
- This graph shows the increase in the build pressure from Friday until Monday.
- Based on this graph, it looks like there were 2 broken-master incidents that day.
- The first incident (1st), which I am still trying to track down, was fixed late in the EMEA day.
- The second incident (2nd) started late in the AMER day and continued until Monday.
- It was caused by a Rubocop failure.
- Effects on the auto-deploy: At 5:49 PM, our AMER RM noticed that the auto-deploy pressure was not decreasing even though we were deploying continuously (internal Slack thread).
- It turned out that the alert pertains to unpackaged commits (not undeployed commits); the build pressure was increasing because of the broken master.
- The result: The packages being built contained no changes from the GitLab Rails code, only changes from CNG and Omnibus. This is an example package: `18.6.202510171351`
- The `rubocop` failure that broke master originates from this MR: gitlab-org/gitlab!205020 (merged).
- It was later fixed in this MR: gitlab-org/gitlab!209462 (merged).
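
The difference between build pressure and deploy pressure described above can be illustrated with a small sketch. This is a simplified model, not the actual release tooling: the `PipelineState` class, its fields, and the commit counts are hypothetical, and it assumes deploy pressure counts commits that are packaged but not yet deployed.

```python
from dataclasses import dataclass


@dataclass
class PipelineState:
    """Hypothetical snapshot of how far each stage has progressed."""
    merged: int    # commits merged to master so far
    packaged: int  # commits included in the latest built package
    deployed: int  # commits included in the latest deployed package


def build_pressure(state: PipelineState) -> int:
    # Unpackaged commits: merged to master but not yet in any package.
    return state.merged - state.packaged


def deploy_pressure(state: PipelineState) -> int:
    # Assumption: undeployed commits are those packaged but not yet deployed.
    return state.packaged - state.deployed


# Illustrative numbers only.
before = PipelineState(merged=1000, packaged=900, deployed=880)

# Master is broken: merges continue, no new Rails commits are packaged,
# and deployments catch up with what was already packaged.
during = PipelineState(merged=1060, packaged=900, deployed=900)

print(build_pressure(before), deploy_pressure(before))
print(build_pressure(during), deploy_pressure(during))
```

In this model, deploying continuously drives the deploy pressure to zero while the build pressure keeps growing, which matches what the RM observed.
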
Monday (2025-10-20)
- This graph shows the long-running master broken incident since Friday
- Fortunately, there were some packages built on Friday (in between the two master-broken incidents), so we were able to deploy some changes.
- Another master-broken incident (3rd) started on Sunday due to a failing RSpec test.
- This overlapped with the ongoing (2nd) master-broken incident from Friday.
- The fix for this (3rd) was merged on 2025-10-20 05:23.
- The fix for the second master-broken incident from Friday was finally merged on 2025-10-20 13:26.
- Note that Slack was down on the same day, which hampered our ability to notice the issue.
- Another master-broken incident (4th) started at 2025-10-20 05:39; this was due to a failing RSpec test.
- The fix for this (4th) was merged at 2025-10-21 02:18.
- Based on the fix MR description, this spec was recently removed from fast-quarantine.
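
To put the Monday timestamps in perspective, the sketch below computes two intervals from the times listed above: the gap between the fix for the (3rd) incident and the start of the (4th), and the time from the start of the (4th) incident to the merge of its fix. It assumes all timestamps are in the same time zone, and it treats the merge time of a fix as an approximation of when master recovered. The `elapsed` helper is illustrative only.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"


def elapsed(start: str, end: str):
    """Return the timedelta between two timestamps written as in the timeline."""
    return datetime.strptime(end, FMT) - datetime.strptime(start, FMT)


# Fix for the (3rd) incident merged -> (4th) incident started.
gap_between_incidents = elapsed("2025-10-20 05:23", "2025-10-20 05:39")

# (4th) incident started -> fix for the (4th) incident merged.
fourth_incident = elapsed("2025-10-20 05:39", "2025-10-21 02:18")

print(gap_between_incidents)  # 0:16:00
print(fourth_incident)        # 20:39:00
```

Under these assumptions, only about 16 minutes separated the fix for the (3rd) incident from the start of the (4th), which then took more than 20 hours to fix.
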
Tuesday (2025-10-21)
- This graph shows the build pressure on 2025-10-21. It was affected by the (4th) incident from late 2025-10-20 and by another one after that (5th).
- The (5th) master-broken incident was due to the failing Rubocop job.
- The fix is in gitlab-org/gitlab!209587 (merged). It was merged on 2025-10-21 07:57.
Notes:
- We currently have a `ReleaseManagementNumberOfUnpackagedCommits` metric and alert set up for the unpackaged commits (build pressure). There is also another `ReleaseManagementNumberOfUndeployedCommits` metric and alert set up for the undeployed commits (deploy pressure). The threshold for both alerts is 100.
- Graph of the `ReleaseManagementNumberOfUnpackagedCommits` alert (Source)
- These master-broken incidents blocked our tooling from packaging the already merged commits, since our tooling requires a green master when building a package.
- These also blocked us from merging the security MRs to be included in the patch release last Wednesday.
- Fortunately, we were able to merge and deploy them in time, and the patch release was not delayed.
- That patch release contains severity 2 security fixes, which cannot be delayed.
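
The green-master requirement and the alert threshold from the notes can be sketched as follows. This is a minimal model, not the real packaging logic: the function names, the `(sha, is_green)` history format, and the commit data are hypothetical, and it assumes the alert fires when the count exceeds the threshold of 100.

```python
from typing import List, Optional, Tuple

ALERT_THRESHOLD = 100  # threshold stated in the notes for both alerts

Commit = Tuple[str, bool]  # (sha, master pipeline was green at this commit)


def latest_green_index(history: List[Commit]) -> Optional[int]:
    """Return the index of the newest green commit; history is oldest first."""
    for index in range(len(history) - 1, -1, -1):
        if history[index][1]:
            return index
    return None


def unpackaged_commits(history: List[Commit]) -> int:
    """Count commits newer than the newest green (packageable) commit."""
    index = latest_green_index(history)
    if index is None:
        return len(history)
    return len(history) - 1 - index


# Hypothetical history: one green commit, then 120 commits on a red master.
history = [("a1", True)] + [(f"c{n}", False) for n in range(120)]

unpackaged = unpackaged_commits(history)
alert_fires = unpackaged > ALERT_THRESHOLD  # assumption: strictly greater than

print(unpackaged, alert_fires)
```

As long as master stays red, the newest packageable commit does not move, so every additional merge adds to the unpackaged count until the alert threshold is crossed.
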



