Add monitoring for the Helm charts publishing process
Background
When a GitLab version is released, Helm charts must also be published to charts.gitlab.io so that users who install GitLab via Helm can pick up the new version. The publishing process involves several hops:
- A tag is created in the canonical (or security) Charts repository on GitLab.com
- The tag is mirrored to dev.gitlab.org and a pipeline runs there
- The release_package CI job in that pipeline triggers a downstream pipeline in the charts/charts.gitlab.io repository on GitLab.com
- Once that downstream pipeline succeeds, the new chart version is live and available to users
Problem
Recently, a GitLab version was released but the corresponding Helm charts were not published to charts.gitlab.io. Release Managers (RMs) had no visibility into this failure — it was only discovered when a team member noticed and reported it to us.
The current RM process only checks whether the tag pipeline on dev.gitlab.org was successful. It does not verify the downstream triggered pipeline in charts.gitlab.io.
Reference: gitlab-org/release/docs!1014 (merged), and discussion in the revert MR charts/charts.gitlab.io!189 (comment 3152738040)
Goal
Add monitoring so that RMs are notified when the publish step fails.
Options
The following options are available. This issue is intended to kick off discussion so we can agree on an implementation plan.
Option A: Automate pipeline status checks with a Slack alert
Add an automated check to the dev.gitlab.org tag pipeline which polls the charts.gitlab.io pipeline status after a release tag is created and fails the tag pipeline if the triggered pipeline fails.
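As a rough illustration of the polling part of Option A, here is a minimal Python sketch. It assumes the ID of the triggered charts.gitlab.io pipeline is already known (for example, looked up from the tag pipeline's trigger job) and that a token with read access to that project is available. The function names, timeout, and polling interval are hypothetical; only the `GET /projects/:id/pipelines/:pipeline_id` endpoint and the `PRIVATE-TOKEN` header come from the GitLab REST API.

```python
import json
import time
import urllib.parse
import urllib.request

# Pipeline statuses after which GitLab does not change the status on its own.
TERMINAL_STATUSES = {"success", "failed", "canceled", "skipped"}


def fetch_pipeline_status(api_url, project, pipeline_id, token):
    """Return the `status` field of one pipeline from the GitLab REST API.

    `api_url` is assumed to look like "https://gitlab.com/api/v4" and
    `project` is a numeric ID or a path such as "charts/charts.gitlab.io".
    """
    project_ref = urllib.parse.quote(str(project), safe="")
    url = f"{api_url}/projects/{project_ref}/pipelines/{pipeline_id}"
    request = urllib.request.Request(url, headers={"PRIVATE-TOKEN": token})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["status"]


def wait_for_pipeline(get_status, timeout_s=3600, interval_s=60, sleep=time.sleep):
    """Poll get_status() until it returns a terminal status.

    Returns the terminal status, or "timeout" if none was seen in time.
    The timeout and interval defaults are placeholders, not agreed values.
    """
    waited = 0
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if waited >= timeout_s:
            return "timeout"
        sleep(interval_s)
        waited += interval_s
```

A CI job could call `wait_for_pipeline(lambda: fetch_pipeline_status(...))` and exit non-zero on anything other than `"success"`, which would fail the tag pipeline as described above. Passing the status lookup in as a callable keeps the polling loop testable without network access.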
Option B: Add a post-release verification job to the Charts pipeline in Canonical
Add a CI job to the Charts tag pipeline which runs helm show chart --version ... and verifies that the chart has been published to charts.gitlab.io.
This would be similar to the [check-packages-availability](https://gitlab.com/gitlab-org/omnibus-gitlab/-/blob/a072dc9d053443bd88a502210c3213abea4ab785/gitlab-ci-config/dev-gitlab-org.yml#L1849) CI job in the Omnibus codebase.
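A verification job along these lines could be sketched in Python as follows. This is an illustration only: it assumes the `helm` binary is installed, that the charts.gitlab.io repository has already been added under the alias `gitlab-repo` and refreshed with `helm repo update`, and that the expected chart version is passed in by the pipeline. The function names are hypothetical.

```python
import subprocess


def helm_show_chart(version, chart="gitlab-repo/gitlab"):
    """Run `helm show chart --version <version> <chart>`.

    Returns (exit code, stdout). The stdout is the chart's Chart.yaml text.
    """
    result = subprocess.run(
        ["helm", "show", "chart", "--version", version, chart],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.returncode, result.stdout


def chart_version_from_output(output):
    """Extract the top-level `version:` value from Chart.yaml text, if any.

    Indented `version:` keys (for example under dependencies) are ignored.
    """
    for line in output.splitlines():
        if line.startswith("version:"):
            return line.split(":", 1)[1].strip().strip("'\"")
    return None


def is_chart_published(version, run=helm_show_chart):
    """Return True if helm succeeds and reports exactly the requested version."""
    returncode, output = run(version)
    return returncode == 0 and chart_version_from_output(output) == version
```

The job would call `is_chart_published(expected_version)` and fail when it returns `False`. Because publishing happens in a downstream pipeline, such a job would probably need a retry or wait policy as well, which is left open here.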
Option C: Add a manual checklist step
Warning
We don't want to add more steps to the checklist. So, this is not really an option. I am noting it here, just in case.
Add a step to the RM release checklist to explicitly verify the triggered pipeline in charts.gitlab.io and confirm chart availability via:
helm show chart --version <version> gitlab-repo/gitlab
Exit Criteria
- Discuss!
- Decide which option to implement / what to do
- Implement!
- Verify with 18.11 monthly release