Deploy workload identity for container-registry

This is a follow-up to the discussion in !2352 (comment 2568695656).

The goal is to switch container-registry auth on SaaS from static credentials to workload identity, as part of https://gitlab.com/gitlab-com/gl-infra/production-engineering/-/issues/22409. The monolith is also moving in this direction.

Release plan:

  1. Deploy with caching disabled - deploy the container registry with URL caching left disabled in its configuration.
  2. Enable caching in dry-run mode - enable URL caching without enabling workload identity.
    • Add urlcaching metrics visualization to the container-registry storage dashboards (#1621 (closed)).
    • Monitor Redis load.
  3. Staging validation - enable workload identity + URL caching for basic smoke testing
  4. Deploy container registry with TTL bugfix
  5. Gradual production rollout - starting with gcny, then zone by zone:
    • Cut Registry backend traffic in HAProxy to 0%
    • Enable workload identity + URL caching in configuration
    • Increase traffic in ~5% intervals while monitoring IAM load, cache hit ratio, and error rates
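
The traffic ramp in step 5 can be sketched roughly as below. This is only an illustration of the "~5% intervals with monitoring" idea, not the actual rollout tooling: the `Metrics` fields, the `safe_to_increase` gate, and every threshold value are hypothetical placeholders, and setting the HAProxy backend weight itself is left out.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class Metrics:
    """Snapshot of the signals named in the plan (all values hypothetical)."""
    iam_requests_per_sec: float
    cache_hit_ratio: float  # 0.0 .. 1.0
    error_rate: float       # 0.0 .. 1.0


def ramp_steps(step: int = 5, start: int = 0, end: int = 100) -> Iterator[int]:
    """Yield traffic percentages from start to end, always finishing exactly at end."""
    if step <= 0:
        raise ValueError("step must be positive")
    weight = start
    while weight < end:
        yield weight
        weight += step
    yield end


def safe_to_increase(
    m: Metrics,
    max_iam_rps: float = 100.0,     # assumed limit, not taken from the plan
    min_hit_ratio: float = 0.9,     # assumed limit, not taken from the plan
    max_error_rate: float = 0.01,   # assumed limit, not taken from the plan
) -> bool:
    """Return True only if every monitored signal is within its threshold."""
    return (
        m.iam_requests_per_sec <= max_iam_rps
        and m.cache_hit_ratio >= min_hit_ratio
        and m.error_rate <= max_error_rate
    )
```

In use, an operator (or a script) would walk through `ramp_steps()` and move to the next percentage only while `safe_to_increase` returns true for the latest metrics snapshot, holding or rolling back to 0% otherwise.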
Edited by Pawel Rozlach