Add timed deployments to AutoDevOps incremental rollouts
Problem to solve
To reduce operational risk when rolling out new code, a best practice is to roll out code changes to a fleet slowly and incrementally, pausing at defined breakpoints and optionally allowing metrics to pause or abort the rollout. Our current Kubernetes canary deployment is implemented only as a rollout percentage that is set manually on a per-run basis.
gitlab-ce#51352 (closed) introduces the ability to schedule or delay job execution. Building on this feature, we can update the incremental rollouts procedure in AutoDevOps to schedule each increment automatically.
This is a v2 iteration of the existing k8s incremental rollouts feature, which exists in production today. It adds the ability to set an automatic timeout, after which the deployment continues to the next rollout percentage in the deployment phase. For the MVC, this will be a preset value of 5m for users choosing timed incremental rollouts; the end user can edit this value in the CI YML after generation.
The current idea to get this shipped is to provide a generic function rather than one specific to k8s rollouts. Generic in the sense that it is available for any type of workflow at the Job level, not the Environment level, and is not k8s-specific (i.e., not implemented inside the AutoDevOps shell script).
We will provide a way to enable the timed incremental rollout in AutoDevOps, in addition to the existing manual one, using the shared when: in 1hour feature.
Each of these rollout percentages will be implemented in a separate stage, so that if the job for one percentage is canceled, all further progress pauses.
Default AutoDevOps (generated config) wait between steps: 5m. The user can edit the YML afterwards to change the timeout.
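A rough sketch of how the generated config could look, with one stage per percentage and a 5-minute wait before each increment. It assumes GitLab CI's delayed-job keywords (when: delayed with start_in) rather than the when: in 1hour spelling used in this proposal. The job names, stage names, the ROLLOUT_PERCENTAGE variable, and the ./rollout.sh script are hypothetical placeholders, not the actual AutoDevOps template.

```yaml
# Sketch only: job names, stage names, ROLLOUT_PERCENTAGE, and ./rollout.sh
# are hypothetical and stand in for whatever AutoDevOps generates.
stages:
  - rollout_10
  - rollout_25
  - rollout_50
  - rollout_100

# Settings shared by every timed increment.
.timed_rollout: &timed_rollout
  script:
    - ./rollout.sh "$ROLLOUT_PERCENTAGE"   # hypothetical deploy script
  when: delayed
  start_in: 5 minutes                      # the 5m default; edit to change the wait

timed rollout 10%:
  <<: *timed_rollout
  stage: rollout_10
  variables:
    ROLLOUT_PERCENTAGE: "10"

timed rollout 25%:
  <<: *timed_rollout
  stage: rollout_25
  variables:
    ROLLOUT_PERCENTAGE: "25"

timed rollout 50%:
  <<: *timed_rollout
  stage: rollout_50
  variables:
    ROLLOUT_PERCENTAGE: "50"

timed rollout 100%:
  <<: *timed_rollout
  stage: rollout_100
  variables:
    ROLLOUT_PERCENTAGE: "100"
```

Because each increment sits in its own stage, canceling one delayed job should keep the later stages from starting, which matches the pause behavior described here.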
For the incremental rollout (https://docs.gitlab.com/ee/topics/autodevops/#incremental-rollout-to-production), users can set the secret variable INCREMENTAL_ROLLOUT_ENABLED=1 to choose the manual incremental rollout (10%/25%/50%/100%) instead of the standard deploy (100%).
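One way the variable could gate the two deploy paths is sketched here, assuming the only:variables and except:variables keywords of GitLab CI. The job names and the ./deploy.sh and ./rollout.sh scripts are hypothetical, and only the first manual increment is shown.

```yaml
# Sketch only: job names and scripts are hypothetical.
stages:
  - production

# Standard deploy (100%): runs when the variable is not set.
production:
  stage: production
  script:
    - ./deploy.sh                 # hypothetical full deploy
  except:
    variables:
      - $INCREMENTAL_ROLLOUT_ENABLED

# Manual incremental rollout: offered when the variable is set.
rollout 10%:
  stage: production
  script:
    - ./rollout.sh 10             # hypothetical partial deploy
  when: manual
  only:
    variables:
      - $INCREMENTAL_ROLLOUT_ENABLED
```

The 25%, 50%, and 100% jobs would follow the same pattern with a different percentage argument.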
The configuration is available directly when setting up a new AutoDevOps project.
Links / references
- Alternate: Canary deploys: #1659 (closed)
- Tweet: https://twitter.com/officesunshine/status/821787299084173320
- Pausing and Resuming a Deployment
- Canary deployments