Geo: A rake task to perform post-omnibus Geo update steps in Zero downtime deployments

Problem to solve

When updating Geo in a zero-downtime deployment, a systems administrator has to perform several manual steps per node. This is error-prone, and we aim to reduce the number of manual steps required to update Geo.

Intended users

  • Systems administrators

Further details

This feature is part of our strategy to improve the user experience of Geo by reducing manual interventions.

Proposal

When updating a Geo deployment using the Zero Downtime instructions, a systems administrator needs to manually perform 8 steps (4 + 1 steps after omnibus-gitlab!3562 (merged) is merged).

We aim to reduce these steps, for both the primary and secondary nodes, to a single rake task, e.g. gitlab-rake gitlab:geo:update.

In omnibus-gitlab#4637 (closed) we realized that a single task cannot cover this; we need two tasks, one "before" and one "after". This is the current proposal:

Before                                                           | After
SKIP_POST_DEPLOYMENT_MIGRATIONS=true sudo gitlab-ctl reconfigure |
sudo gitlab-ctl hup unicorn (optional)                           | gitlab-rake gitlab:geo:update
sudo gitlab-ctl hup sidekiq (optional)                           |
update GitLab on other machines in cluster                       | update GitLab on other machines in cluster
sudo gitlab-rake db:migrate                                      | gitlab-rake gitlab:geo:post-update
sudo gitlab-rake gitlab:geo:check                                |
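
To make the ordering of the manual steps in the Before column concrete, here is a minimal shell sketch that groups them into two phases around the "update GitLab on other machines in cluster" step. It is an illustration only, not the proposed rake tasks: the function names geo_pre_update, geo_post_update and run_step, and the DRY_RUN variable, are hypothetical and do not exist in GitLab. Only the commands themselves are taken from the table.

```shell
#!/bin/sh
# Hypothetical wrapper around the manual steps listed in the "Before" column.
# DRY_RUN is an assumption of this sketch: it defaults to 1, so commands are
# only printed. Set DRY_RUN=0 to actually execute them.

run_step() {
  # Print the command, then execute it only when DRY_RUN=0.
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@" || { echo "step failed: $*" >&2; return 1; }
  fi
}

geo_pre_update() {
  # Steps to run on this node before the other machines are updated.
  # The two "hup" steps are marked optional in the table.
  run_step env SKIP_POST_DEPLOYMENT_MIGRATIONS=true sudo gitlab-ctl reconfigure &&
  run_step sudo gitlab-ctl hup unicorn &&
  run_step sudo gitlab-ctl hup sidekiq
}

geo_post_update() {
  # Steps to run after the other machines in the cluster are updated.
  run_step sudo gitlab-rake db:migrate &&
  run_step sudo gitlab-rake gitlab:geo:check
}
```

With the default DRY_RUN=1, calling geo_pre_update prints the three commands, each prefixed with "+ ", without running anything. The split into two functions mirrors the two proposed tasks, gitlab:geo:update and gitlab:geo:post-update, with the cluster-wide update happening in between.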

Documentation

N/A

Testing

What does success look like, and how can we measure that?

  • The number of manual interventions is reduced by 50%.
  • Geo update procedure is the same on primary and secondaries.
  • A single rake task for Geo updates.

What is the type of buyer?

  • Premium
  • Ultimate

Links / references

Edited Sep 25, 2019 by Toon Claes