Define a process to account for broken stable branches
Context
Delivery is working to extend the GitLab maintenance policy to support bug fixes to the two previous monthly releases in addition to the current stable release. Part of this work will open up the stable branches to allow developers to merge bug fixes directly into the stable branch rather than rely on the pick labels.
Stable branches are the source of GitLab releases, so it's vital to guarantee their readiness and availability. There are efforts planned to minimize failures on these branches (such as #2725 (closed) and #2654 (closed)). However, when failures occur, they should be promptly addressed so they don't block the release schedule.
Similar to the master-broken process, a broken stable-branch process should account for:
- Failures introduced by merge requests.
- Flaky failures present on stable branches.
- Failures on stable branches blocking release activities.
The purpose of this issue is to agree on the process for these failures.
#2782 (closed).
Failures introduced by merge requests
- Failures on stable branches should be communicated:
  - A Slack message on the #releases channel should be automatically posted.
  - The merge request author should be notified about this failure.
- An issue on the release/tasks repository should be opened.
  - The issue should be assigned to the merge request author and the release managers.
- The merge request author acts as the resolution DRI and works on a fix for the failure. Alternatively, the release managers can opt to revert the merge request.
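The communication steps above could be automated along these lines. This is a minimal sketch, not an existing tool: the `StableBranchFailure` and `NotificationPlan` types, the `plan_notifications` function, and the example usernames are all hypothetical. Only the #releases channel and the release/tasks repository come from the process described here. The sketch builds the notification data and leaves the actual Slack and GitLab API calls to the caller.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StableBranchFailure:
    """A failed pipeline on a stable branch (hypothetical shape)."""
    branch: str          # stable branch name, e.g. "15-8-stable-ee"
    pipeline_url: str    # link to the failed pipeline
    mr_reference: str    # merge request that introduced the failure
    mr_author: str       # username of the merge request author


@dataclass
class NotificationPlan:
    """Everything needed to announce the failure and track its resolution."""
    slack_channel: str
    slack_text: str
    issue_project: str
    issue_title: str
    issue_assignees: List[str] = field(default_factory=list)


def plan_notifications(failure: StableBranchFailure,
                       release_managers: List[str]) -> NotificationPlan:
    """Build the Slack message and tracking issue for a broken stable branch."""
    slack_text = (
        f"Stable branch {failure.branch} is broken: {failure.pipeline_url} "
        f"(introduced by {failure.mr_reference}, cc @{failure.mr_author})"
    )
    return NotificationPlan(
        slack_channel="#releases",
        slack_text=slack_text,
        issue_project="release/tasks",
        issue_title=f"Broken stable branch: {failure.branch}",
        # The merge request author is the resolution DRI, so they come first.
        issue_assignees=[failure.mr_author, *release_managers],
    )
```

Keeping the plan as plain data makes it easy to review what would be posted before wiring it to any real integration.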
Flaky failures
As part of the master-broken process, flaky failures are regularly fixed on the GitLab default branch. To ensure these failures don't propagate to stable branches, the responsibilities of the resolution DRI could be expanded to backport the fix to the respective stable branches. A step could be added to the respective templates to account for this.
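To illustrate which branches such a backport step would touch, here is a small sketch that picks the stable branches for the current release and the two previous monthly releases. The `stable_branch` and `backport_targets` helpers are hypothetical, and the `X-Y-stable-ee` naming pattern is an assumption about the target project. The caller supplies the list of released versions, because the minor version preceding a new major cannot be derived arithmetically.

```python
from typing import List, Tuple

Version = Tuple[int, int]  # (major, minor)


def stable_branch(version: Version, suffix: str = "-stable-ee") -> str:
    """Format a (major, minor) version as a stable branch name (assumed pattern)."""
    major, minor = version
    return f"{major}-{minor}{suffix}"


def backport_targets(released: List[Version], supported: int = 3) -> List[str]:
    """Return the stable branches of the newest `supported` releases, newest first."""
    newest_first = sorted(set(released), reverse=True)
    return [stable_branch(v) for v in newest_first[:supported]]
```

For example, `backport_targets([(14, 10), (15, 0), (15, 1), (14, 9)])` selects the branches for 15.1, 15.0, and 14.10, and skips 14.9 because it falls outside the three supported releases.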
Documentation / Templates to update
- Flaky failure template - gitlab-org/gitlab!109134 (merged)
- Resolution of master-broken regarding stable branches https://about.gitlab.com/handbook/engineering/workflow/#resolution-of-broken-master - gitlab-com/www-gitlab-com!118053 (merged)
#2705 (closed)
Failures on stable branches prevent release activities
When a failure on a stable branch blocks a monthly, patch, or security release, release managers should treat it as a lightweight incident and:
- Open a release/tasks issue, if none is open yet.
- Request dev-escalation assistance to identify the root cause and fix the failure.
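The two steps above could be expressed as a small checklist helper, for instance as part of release tooling. This is only a sketch: the `blocking_failure_steps` function is hypothetical, and it returns the steps as text rather than performing them.

```python
from typing import List


def blocking_failure_steps(release_issue_open: bool) -> List[str]:
    """List the actions for a stable-branch failure that blocks a release."""
    steps = []
    # Only open a tracking issue when none exists yet.
    if not release_issue_open:
        steps.append("Open a release/tasks issue for the failure")
    steps.append(
        "Request dev-escalation assistance to identify the root cause "
        "and fix the failure"
    )
    return steps
```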