Next steps to use existing canary deployment
As of #1924 (closed), we have a set of canary machines to which we should be able to deploy iterations before they hit an RC. What do we need to do to use this?
One complication discussed in Slack is that Sidekiq jobs may need to be versioned in some way: a Rails process running newer code can schedule new jobs, or change the parameters of an existing worker, in ways that older versions do not know about.
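To make the versioning concern concrete, here is a minimal sketch in plain Ruby of one possible approach: each job carries a version number, and a dispatcher running older code defers any job it does not recognize instead of failing on it. Everything here is an assumption for illustration. The `job_version` field, `SUPPORTED_VERSION`, `WORKERS`, `build_job`, and `dispatch` are hypothetical names, not part of Sidekiq or GitLab.

```ruby
require 'json'

# Hypothetical worker as it exists in the OLDER release.
class ExampleWorker
  # Highest job version this release knows how to run (assumed convention).
  SUPPORTED_VERSION = 1

  def perform(project_id)
    project_id
  end
end

# Workers known to the older release; a newer release may enqueue others.
WORKERS = { 'ExampleWorker' => ExampleWorker }.freeze

# Serialize a job with an explicit version (hypothetical envelope format).
def build_job(worker_name, version, args)
  JSON.generate('class' => worker_name, 'job_version' => version, 'args' => args)
end

# Run the job if this release understands it, otherwise defer it.
def dispatch(raw_job)
  job = JSON.parse(raw_job)
  worker = WORKERS[job['class']]

  # Unknown worker class: scheduled by newer code.
  return :deferred if worker.nil?

  # Jobs without a version are treated as version 1.
  version = job.fetch('job_version', 1)
  return :deferred if version > worker.const_get(:SUPPORTED_VERSION)

  worker.new.perform(*job['args'])
  :processed
end
```

In this sketch a deferred job would have to be re-enqueued for a process running the newer code. How that routing works (for example, separate queues for the canary) is left open here.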
Here is a strawman proposal for how we might use this canary deployment in practice:
- Use the nightly EE omnibus package
- The package gets deployed on the canary (e.g. via ChatOps). The package can be rolled back if necessary.
- Alerts for the canary get sent out to flag possible issues.
What do we need to do to address each point?
UPDATE:
Per discussion below, the following four points need to be addressed to unblock this:
- Make canary.gitlab.com reflect master (#2606 (closed))
- Make sure database migrations are truly backwards compatible (https://gitlab.com/gitlab-org/gitlab-ce/issues/36911)
- Make a good mechanism to revert the DB changes (https://gitlab.com/gitlab-org/gitlab-ce/issues/36912)
- Figure out a plan to handle Sidekiq versioning issues for canary deployment (#2607 (closed))
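As a rough illustration of the migration point above, a check could flag migration operations that are not purely additive, since those are the ones most likely to break code from the previous release. This is a sketch under strong assumptions: the operation hashes and the `ADDITIVE_OPERATIONS` list are hypothetical, the list is not exhaustive, and even an additive change can be incompatible (for example, a new NOT NULL column without a default).

```ruby
# Operation types treated as additive in this sketch (illustrative only).
ADDITIVE_OPERATIONS = %i[create_table add_column add_index].freeze

# Return the operations that older code may not tolerate.
# Each operation is a hash such as { type: :add_column, table: :projects }.
def incompatible_operations(operations)
  operations.reject { |op| ADDITIVE_OPERATIONS.include?(op[:type]) }
end

# Hypothetical migration described as a list of operations.
migration = [
  { type: :add_column, table: :projects, column: :archived_at },
  { type: :remove_column, table: :projects, column: :legacy_flag }
]

flagged = incompatible_operations(migration)
flagged.each { |op| puts "needs review: #{op[:type]} on #{op[:table]}" }
```

Anything flagged this way would need manual review, or a split into additive and cleanup steps across releases, before it reaches the canary.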
/cc: @marin, @DouweM, @rymai, @smcgivern, @pcarranza, @ayufan
Edited by Ernst van Nierop