(Size L) Cells 1.5: Automate Advanced search upgrades

Problem to solve

Related discussion in Dedicated: https://gitlab.com/gitlab-com/gl-infra/gitlab-dedicated/team/-/issues/5857#note_2039356230

Elasticsearch for GitLab.com is hosted on Elastic Cloud. Deployments are handled manually through change requests and follow this runbook: https://gitlab.com/gitlab-com/runbooks/-/tree/master/docs/elastic#upgrade-checklist

When there are multiple cells (and multiple search backends), we cannot sustain a manual process for upgrades.

Proposal

Start working on an automation strategy for upgrades. The first iteration can automate through rake tasks, with an API exposed later (see #323941).

Two rake tasks should be created:

  1. Prepare for upgrade
     • pause indexing
     • validate the queues have drained
     • give the OK
  2. Validate that the upgrade was successful
     • validate search works
     • unpause indexing
     • wait for queues to drain
     • validate indexing works
     • give the OK
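
The two tasks above could be sketched roughly as follows. This is only an illustration of the control flow in Python, not the rake tasks themselves (those would be written in Ruby). Everything named here is an assumption made for the sketch: the backend methods (`set_indexing_paused`, `queue_size`, `search_works`, `indexing_works`), the `Result` type, the timeout and poll defaults, and the `FakeBackend` stand-in. The issue does not say which queues must drain, so the sketch treats "the queues" as a single number reported by the backend.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class Result:
    ok: bool       # the "OK" each task gives at the end
    reason: str    # short explanation for the operator


def wait_for_drain(backend, timeout_s: float = 600.0, poll_s: float = 5.0,
                   sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll until the backend reports an empty queue or the timeout passes."""
    deadline = time.monotonic() + timeout_s
    while backend.queue_size() > 0:
        if time.monotonic() >= deadline:
            return False
        sleep(poll_s)
    return True


def prepare_for_upgrade(backend, **wait_opts) -> Result:
    """Task 1: pause indexing, confirm the queues drained, give the OK."""
    backend.set_indexing_paused(True)
    if not wait_for_drain(backend, **wait_opts):
        return Result(False, "queues did not drain before the timeout")
    return Result(True, "indexing paused and queues drained")


def validate_upgrade(backend, **wait_opts) -> Result:
    """Task 2: check search, unpause, wait for drain, check indexing."""
    if not backend.search_works():
        # Assumption: keep indexing paused if search is broken after the upgrade.
        return Result(False, "search check failed; indexing left paused")
    backend.set_indexing_paused(False)
    if not wait_for_drain(backend, **wait_opts):
        return Result(False, "queues did not drain after unpausing")
    if not backend.indexing_works():
        return Result(False, "indexing check failed")
    return Result(True, "search and indexing validated")


class FakeBackend:
    """Hypothetical in-memory stand-in, used only to exercise the sketch."""

    def __init__(self, pending: int = 3, healthy: bool = True):
        self.paused = False
        self.pending = pending
        self.healthy = healthy

    def set_indexing_paused(self, paused: bool) -> None:
        self.paused = paused

    def queue_size(self) -> int:
        size = self.pending
        self.pending = max(0, self.pending - 1)  # pretend one item finishes per poll
        return size

    def search_works(self) -> bool:
        return self.healthy

    def indexing_works(self) -> bool:
        return self.healthy
```

Returning a result with a reason, rather than raising, is one possible way to let a wrapper script or a later API report the "OK" per cell. Whether a failed search check should leave indexing paused is an open design question that the sketch answers one way only for illustration.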

Helpful links

Edited by Terri Chu