Investigate a failed rollback for the Kubernetes Container Registry deployment
In a recent incident: #661 (closed) the deploy for the gitlab helm chart had failed and attempted a rollback. Despite the note of a successful rollback, we were still left in a degraded state. Utilize this issue to determine what happened:
- What signaled the failed deploy
- Why was the Container Registry still down after the rollback
Target job: https://ops.gitlab.net/gitlab-com/gl-infra/k8s-workloads/gitlab-com/-/jobs/885131
Things to look into
- See if we can determine a diff between the stored versions of helm deployments
- In this case,
gitlab.v26andgitlab.v27
- In this case,
- Determine if there was a change which might have impacted the Google Load Balancer associated with the Container Registry service