Geo: A rake task to perform post-omnibus Geo update steps in Zero downtime deployments

Problem to solve

When updating Geo in a zero-downtime deployment, a systems administrator has to perform several manual steps per node. This is error-prone, and we aim to reduce the number of manual steps required to update Geo.

Intended users

Further details

This feature is part of our strategy to improve the user experience of Geo by reducing manual interventions.

Proposal

When updating a Geo deployment using the Zero Downtime instructions, a systems administrator needs to manually perform 8 steps (4 + 1 steps once omnibus-gitlab!3562 (merged) is merged).

We aim to reduce these steps for both the primary and secondary nodes to a single rake task, e.g. gitlab-rake gitlab:geo:update.

In omnibus-gitlab#4637 (closed) we realized that a single task cannot cover this: we need one task before and one task after the other machines in the cluster are updated. This is the current proposal:

Before                                                           | After
SKIP_POST_DEPLOYMENT_MIGRATIONS=true sudo gitlab-ctl reconfigure |
sudo gitlab-ctl hup unicorn (optional)                           | gitlab-rake gitlab:geo:update
sudo gitlab-ctl hup sidekiq (optional)                           |
update GitLab on other machines in cluster                       | update GitLab on other machines in cluster
sudo gitlab-rake db:migrate                                      | gitlab-rake gitlab:geo:post-update
sudo gitlab-rake gitlab:geo:check                                |
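
To make the before/after split concrete, here is a minimal Ruby sketch of how the two proposed tasks might group the existing manual commands. The commands are taken from the Before column. The grouping is an assumption (the issue does not spell out which commands each task would wrap), and GEO_UPDATE_PHASES and geo_phase_commands are hypothetical names, not existing GitLab code. The sketch only builds and prints a plan; it does not execute anything.

```ruby
# Hypothetical mapping of the two proposed tasks to the manual steps they
# would replace. The commands come from the "Before" column; the grouping
# and all Ruby names here are assumptions made for illustration.
GEO_UPDATE_PHASES = {
  "gitlab:geo:update" => [
    { cmd: "SKIP_POST_DEPLOYMENT_MIGRATIONS=true sudo gitlab-ctl reconfigure", optional: false },
    { cmd: "sudo gitlab-ctl hup unicorn", optional: true },
    { cmd: "sudo gitlab-ctl hup sidekiq", optional: true }
  ],
  "gitlab:geo:post-update" => [
    { cmd: "sudo gitlab-rake db:migrate", optional: false },
    { cmd: "sudo gitlab-rake gitlab:geo:check", optional: false }
  ]
}.freeze

# Returns the commands a phase would run, in order. Optional steps are
# dropped when include_optional is false. Raises ArgumentError for a task
# name that is not part of the mapping.
def geo_phase_commands(task, include_optional: true)
  steps = GEO_UPDATE_PHASES.fetch(task) { raise ArgumentError, "unknown task: #{task}" }
  steps.select { |step| include_optional || !step[:optional] }
       .map { |step| step[:cmd] }
end

# Dry run: print the plan for each phase instead of executing it.
if __FILE__ == $PROGRAM_NAME
  GEO_UPDATE_PHASES.each_key do |task|
    puts "#{task}:"
    geo_phase_commands(task).each { |cmd| puts "  #{cmd}" }
  end
end
```

Between the two phases the administrator would still update GitLab on the other machines in the cluster, which is why a single task cannot wrap the whole sequence.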

Documentation

N/A

Testing

What does success look like, and how can we measure that?

  • The number of manual interventions is reduced by 50%.
  • The Geo update procedure is the same on the primary and on secondaries.
  • A single rake task for Geo updates

What is the type of buyer?

  • Premium
  • Ultimate

Links / references

Edited by Toon Claes