Release Coordinated Pipeline checks the status of our Helm Chart, not CNG
Problem Statement
Auto-Deploy doesn't need to know the status of our Helm chart builds. We build our container images in CNG, so those are the pipelines we should monitor. Deployer already has a check for this, but it would be better placed in the Release Coordinated Pipeline. As things stand, the Helm chart and Omnibus can both finish building, the Release Coordinated Pipeline then begins its job and triggers a deployment, and we get stuck on the Kubernetes trigger because the CNG build may not have succeeded. This is a problem because we only notice the failure well after we have started and completed a deploy on Gitaly, which leaves the environment inconsistent until we resolve the problem with building images.
Solution
Move the CNG check to the Release Coordinated Pipeline and remove it from Deployer.
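As a rough illustration of the intended gating, the sketch below lets a deploy start only when the Omnibus and CNG builds have both succeeded, and ignores the Helm chart status entirely. It is not the release-tools implementation (release-tools is written in Ruby); the names `REQUIRED_BUILDS`, `blocking_builds`, and `can_start_deploy`, and the shape of the status mapping, are assumptions made for this example.

```python
from typing import Mapping

# Hypothetical build names for illustration only. The Helm chart is
# deliberately absent: its status should not gate the deploy.
REQUIRED_BUILDS = ("omnibus", "cng")


def blocking_builds(statuses: Mapping[str, str]) -> list[str]:
    """Return the required builds that have not reported success."""
    return [name for name in REQUIRED_BUILDS if statuses.get(name) != "success"]


def can_start_deploy(statuses: Mapping[str, str]) -> bool:
    """A deploy may start only when every required build succeeded."""
    return not blocking_builds(statuses)


if __name__ == "__main__":
    # The failure mode from the problem statement: Helm chart and Omnibus
    # are green, but CNG failed, so the deploy must not start.
    statuses = {"helm": "success", "omnibus": "success", "cng": "failed"}
    print(can_start_deploy(statuses), blocking_builds(statuses))
```

With a gate like this in the coordinated pipeline, a failed CNG build stops the deploy before Gitaly is touched, rather than surfacing later at the Kubernetes trigger.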
References:
- Deployer CNG Check: https://ops.gitlab.net/gitlab-com/gl-infra/deploy-tooling/-/blob/master/common_tasks/cng_verify.yml
- Release Tools Coordinated Pipeline checking helm instead of CNG: https://gitlab.com/gitlab-org/release-tools/-/blob/master/lib/tasks/auto_deploy.rake#L119-135
Example Jobs:
Example Event
- Deployer CNG Check: https://ops.gitlab.net/gitlab-com/gl-infra/deployer/-/jobs/7600832
- Release Tools Coordinated Pipeline checking helm instead of CNG: https://ops.gitlab.net/gitlab-org/release/tools/-/jobs/7599431
Another Example Event
CNG was in the failed state, but our Helm chart build succeeded. As a result, a deploy reached canary, but we weren't able to deploy to Kubernetes.
Implemented
- The release-tools coordinator pipeline now checks the CNG pipeline at the start in the `wait:cng` job.
- The deployer pipeline also checks the CNG pipeline at the start in the `<env>-verify-cng` jobs (`gstg-verify-cng`, `gstg-cny-verify-cng`, etc).
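
For readers unfamiliar with this kind of check, a wait-style job can be pictured as a polling loop: keep asking for the CNG pipeline status, succeed once it reports `success`, and fail fast on a terminal failure. The sketch below is only an assumption about the general shape, not the actual `wait:cng` code. The function name `wait_for_cng`, its parameters, and the injected `fetch_status` callable are hypothetical; the status strings follow GitLab's pipeline status values.

```python
import time
from typing import Callable

# Pipeline statuses treated here as "will never succeed on its own".
TERMINAL_FAILURES = {"failed", "canceled", "skipped"}


def wait_for_cng(
    fetch_status: Callable[[], str],
    attempts: int = 60,
    interval: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll until the CNG pipeline succeeds.

    Returns the number of polls it took. Raises RuntimeError on a terminal
    failure and TimeoutError if the pipeline is still unfinished after
    `attempts` polls. `fetch_status` is a hypothetical callable that returns
    the current pipeline status string.
    """
    for attempt in range(1, attempts + 1):
        status = fetch_status()
        if status == "success":
            return attempt
        if status in TERMINAL_FAILURES:
            raise RuntimeError(f"CNG pipeline ended with status {status!r}")
        if attempt < attempts:
            sleep(interval)  # still pending or running: wait and retry
    raise TimeoutError(f"CNG pipeline did not finish after {attempts} polls")


if __name__ == "__main__":
    # Simulated status sequence; a real job would query the GitLab API.
    simulated = iter(["pending", "running", "success"])
    print(wait_for_cng(lambda: next(simulated), sleep=lambda _: None))
```

Passing `fetch_status` and `sleep` in as arguments keeps the loop testable without network access or real delays.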