Release-tools has no way of knowing it's consistently pulling old commits
Problem statement
Recently @twk3 approached Release Managers noting that no omnibus changes in the last 3 weeks have been deployed via auto-deploy!
- This was caused by red pipelines on the security repository for omnibus, even though local testing and CI on both the .com and dev instances were green
- Release tools was still updating components and thus the ref being deployed for omnibus seemingly always changed
Because of the above, this problem flew under the radar for multiple weeks. Nothing in release-tools alerts anyone when it fails to find a relatively recent green pipeline. As a result, if a pipeline is never green, auto-deploy may never upgrade a component and we may not notice.
Proposal
We need a way to be notified when release-tools finds red pipelines and falls back to an older green one, once the age of that pipeline exceeds a certain threshold. Perhaps if no green pipeline is found within the last 24 hours, release-tools should send a message to Slack prompting us to start an investigation?
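As a rough illustration of the 24-hour idea, the check could look something like the sketch below. `Pipeline`, `GREEN_PIPELINE_MAX_AGE`, and `green_pipeline_overdue?` are hypothetical names invented for this example, not release-tools code; the only thing taken from GitLab is the `success` pipeline status.

```ruby
# Hypothetical sketch: Pipeline is a stand-in struct, not a release-tools class.
Pipeline = Struct.new(:sha, :status, :finished_at)

# Assumed threshold from the proposal: 24 hours, expressed in seconds.
GREEN_PIPELINE_MAX_AGE = 24 * 60 * 60

# Returns true when no successful pipeline finished within max_age seconds
# of `now`, including the case where there is no successful pipeline at all.
def green_pipeline_overdue?(pipelines, now: Time.now, max_age: GREEN_PIPELINE_MAX_AGE)
  latest_green = pipelines
    .select { |pipeline| pipeline.status == 'success' }
    .map(&:finished_at)
    .max

  latest_green.nil? || (now - latest_green) > max_age
end
```

A caller would send the Slack message only when this returns true, so a single red pipeline followed by a green one stays silent.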
When creating a new auto deploy branch, send a notification to #f_upcoming_release if commits that were committed 7 or more days ago have not been included.
- In passing_build.rb or auto_deploy_branch_service.rb, for each project for which we create an auto deploy branch:
  - Find the oldest commit that is not going to be included in the auto deploy branch.
  - If Time.current - excluded_commit.committed_date > 7.days, send a notification to the #f_upcoming_release channel in Slack.
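The steps above could be sketched roughly as follows. This is plain Ruby rather than the ActiveSupport `Time.current` / `7.days` form used in the proposal, and `Commit`, `oldest_excluded_commit`, and `stale_commit_warning` are hypothetical names for illustration only. It also assumes that comparing `committed_date` values is a good enough proxy for "not included in the branch"; a real implementation would more likely walk the commit graph.

```ruby
# Hypothetical sketch: names below are illustrative, not release-tools APIs.
SECONDS_PER_DAY = 24 * 60 * 60
STALE_THRESHOLD = 7 * SECONDS_PER_DAY

# Minimal stand-in for a commit; only the fields the check needs.
Commit = Struct.new(:id, :committed_date)

# Returns the oldest commit that is newer than the branch point and is
# therefore excluded from the auto deploy branch, or nil if none is excluded.
# `branch_commit` is the commit the auto deploy branch is created from.
def oldest_excluded_commit(commits, branch_commit)
  commits
    .select { |commit| commit.committed_date > branch_commit.committed_date }
    .min_by(&:committed_date)
end

# Returns a warning message when the oldest excluded commit has been waiting
# longer than the threshold, otherwise nil.
def stale_commit_warning(project, commits, branch_commit, now: Time.now)
  excluded = oldest_excluded_commit(commits, branch_commit)
  return nil if excluded.nil?

  waiting = now - excluded.committed_date
  return nil unless waiting > STALE_THRESHOLD

  days = (waiting / SECONDS_PER_DAY).floor
  "#{project}: commit #{excluded.id} has waited #{days} days for an auto deploy branch"
end
```

Returning a message (or nil) keeps the check separate from the Slack client, so the caller decides which channel to notify.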
Follow-up enhancements
Exclude change lock periods when calculating the number of days that a commit has been waiting to be included in an auto deploy package.
- Change lock data is available at https://gitlab.com/gitlab-com/gl-infra/change-lock/-/blob/master/config/changelock.yml.
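One possible way to discount change lock periods is to subtract the overlap between each lock window and the waiting interval. The sketch below assumes lock windows have already been parsed into `[start, finish]` pairs of `Time` objects and do not overlap each other; it does not reflect the actual changelock.yml schema, and `waiting_seconds_excluding_locks` is a hypothetical helper name.

```ruby
# Hypothetical sketch: lock windows are plain [start, finish] Time pairs,
# not the real changelock.yml structure.

# Returns the number of seconds between `from` and `to` that do not fall
# inside any lock window. Windows are assumed not to overlap each other.
def waiting_seconds_excluding_locks(from, to, lock_windows)
  locked = lock_windows.sum do |(lock_start, lock_end)|
    overlap_start = [from, lock_start].max
    overlap_end = [to, lock_end].min
    # A negative difference means the window lies outside the interval.
    [overlap_end - overlap_start, 0].max
  end

  (to - from) - locked
end
```

The result can then be compared against the 7-day threshold in place of the raw difference between the current time and the commit date.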
Implemented
The "New auto deploy branch" Slack notification in the #releases channel has been modified to mention release managers if one or more of the branches are far behind the default branch.
This change was behind a feature flag called notify_branch_too_far_behind (https://ops.gitlab.net/gitlab-org/release/tools/-/feature_flags/223/edit), which was removed by gitlab-org/release-tools!1926 (merged).