Zero Downtime Upgrade (ZDU) support for GitLab Secrets Manager
## Problem to solve

Self-managed GitLab customers deploying GitLab Secrets Manager on Kubernetes expect zero-downtime upgrades (ZDU) as a standard operational capability. Currently, OpenBao (the secrets backend) requires a `Recreate` deployment strategy, which introduces brief service interruptions during upgrades. This impacts:

- **Production availability**: Secrets retrieval is unavailable during the upgrade window, potentially blocking CI/CD pipelines and application deployments
- **Operational complexity**: Administrators must schedule maintenance windows for routine upgrades

Supporting ZDU would align GitLab Secrets Manager with the zero-downtime upgrade expectations already established for other GitLab components on Kubernetes.

## Options considered

### Option 1: Helm hook-based upgrade strategy (blocked)

Investigated in [openbao#13](https://gitlab.com/gitlab-org/cloud-native/charts/openbao/-/work_items/13).

This approach proposed using Helm hooks to orchestrate pod replacement order during upgrades. However, the investigation concluded that **hooks are not viable** because:

- Hooks don't work with `helm template | kubectl apply` workflows
- Many GitOps tools (ArgoCD, Flux) don't support Helm hooks natively
- This would limit deployment flexibility for self-managed customers

**Status**: Closed - not viable at the chart level.

### Option 2: Version-gated leader election in OpenBao core (in progress)

Tracked in [openbao#34](https://gitlab.com/gitlab-org/cloud-native/charts/openbao/-/work_items/34).

This approach implements version-aware leader election directly in OpenBao, ensuring older nodes cannot claim leadership after a newer node has been leader. Key benefits:

- Works with any deployment workflow (Helm, ArgoCD, Flux, raw kubectl)
- No hooks or special orchestration required
- Chart can use standard `RollingUpdate` strategy
- Backward compatible with existing deployments

**Status**: In progress - requires upstream OpenBao contribution.

## Documentation requirements

Once ZDU is supported, documentation updates will be needed for:

- OpenBao Helm chart upgrade procedures
- GitLab Secrets Manager admin guide
- Known limitations and prerequisites

## Related links

- https://openbao.org/docs/concepts/ha/
- https://openbao.org/docs/upgrading/ha-upgrade/
- https://docs.gitlab.com/charts/installation/upgrade/#upgrade-with-zero-downtime
- https://docs.gitlab.com/releases/18/gitlab-18-9-released/#zero-downtime-upgrades-now-supported-for-cloud-native-hybrid-deployments
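
## Illustration: version-gated leader election (sketch)

To make the idea behind Option 2 concrete, here is a minimal Python sketch of the gating rule it describes: a node may claim leadership only if its version is at least as high as the highest version that has ever held leadership. This is not OpenBao code and does not reflect the design in openbao#34. The names `ClusterState`, `may_claim_leadership`, and `try_claim`, the idea of storing a version floor in shared state, and the version numbers are all assumptions made for illustration. Real leader election also involves locking and replicated storage, which this sketch leaves out.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Version = Tuple[int, int, int]


def parse_version(text: str) -> Version:
    """Parse 'MAJOR.MINOR.PATCH' into a tuple that compares numerically."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


@dataclass
class ClusterState:
    """Hypothetical shared cluster state (assumed to live in replicated storage)."""
    highest_leader_version: Optional[Version] = None
    leader: Optional[str] = None


def may_claim_leadership(state: ClusterState, candidate: Version) -> bool:
    """Allow a claim only if the candidate is not older than any past leader."""
    floor = state.highest_leader_version
    return floor is None or candidate >= floor


def try_claim(state: ClusterState, node: str, version_text: str) -> bool:
    """Attempt to become leader; older nodes are refused once a newer one has led."""
    version = parse_version(version_text)
    if not may_claim_leadership(state, version):
        return False
    state.leader = node
    # The gate guarantees version >= floor, so this only ever raises the floor.
    state.highest_leader_version = version
    return True


if __name__ == "__main__":
    # Simulated rolling update with made-up node names and versions.
    state = ClusterState()
    for node, version in [
        ("node-old-0", "2.4.0"),
        ("node-new-0", "2.5.0"),
        ("node-old-1", "2.4.0"),
    ]:
        outcome = "elected" if try_claim(state, node, version) else "refused"
        print(f"{node} ({version}): {outcome}")
```

In this simulation the last claim is refused because a 2.5.0 node has already led, which is the property that would let the chart use a standard `RollingUpdate` strategy without controlling the order in which pods are replaced.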