Provision new Sidekiq shard's Redis instance in gstg

This will be done using the gitlab-redis cookbook.

The finalised name is redis-sidekiq-catchall-a based on discussions below.

The originally proposed name for the new instance was redis-sidekiq-shard-a.

sidekiq:
      routingRules:
        - ["worker_name=AuthorizedProjectUpdate::UserRefreshFromReplicaWorker,AuthorizedProjectUpdate::UserRefreshWithLowUrgencyWorker", "quarantine"] # move this to the quarantine shard
        - ["worker_name=AuthorizedProjectsWorker", "urgent_authorized_projects"] # urgent-authorized-projects
        - ["resource_boundary=cpu&urgency=high", "urgent_cpu_bound"] # urgent-cpu-bound
        - ["resource_boundary=memory", "memory_bound"] # memory-bound
        - ["feature_category=global_search&urgency=throttled", "elasticsearch"] # elasticsearch
        - ["resource_boundary!=cpu&urgency=high", "urgent_other"] # urgent-other
        - ["resource_boundary=cpu&urgency=default,low", "low_urgency_cpu_bound"] # low-urgency-cpu-bound
        - ["feature_category=database&urgency=throttled", "database_throttled"] # database-throttled
        - ["feature_category=gitaly&urgency=throttled", "gitaly_throttled"] # gitaly-throttled
        - ["*", "default", "queues_shard_a"] # catchall on k8s
...
redisYmlOverride:
  queues_shard_a: ... # contains details for redis-sidekiq-catchall-a
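
To make the effect of the third element in the catchall rule concrete, here is a minimal Python sketch of first-match routing. It is a simplified stand-in and not GitLab's actual router: it only understands `&`, `=`, `!=`, comma-separated values and `*`, the names `term_matches` and `route` are hypothetical, and returning `None` for rules without a shard is an assumption standing in for "use the default Redis instance".

```python
from typing import Dict, List, Optional, Tuple

# A subset of the routing rules above; an optional third element names the shard.
ROUTING_RULES: List[List[str]] = [
    ["worker_name=AuthorizedProjectsWorker", "urgent_authorized_projects"],
    ["resource_boundary=cpu&urgency=high", "urgent_cpu_bound"],
    ["resource_boundary=memory", "memory_bound"],
    ["resource_boundary!=cpu&urgency=high", "urgent_other"],
    ["resource_boundary=cpu&urgency=default,low", "low_urgency_cpu_bound"],
    ["*", "default", "queues_shard_a"],
]


def term_matches(term: str, attrs: Dict[str, str]) -> bool:
    """Evaluate one `key=v1,v2` or `key!=v1,v2` term against worker attributes."""
    if "!=" in term:
        key, values = term.split("!=", 1)
        return attrs.get(key) not in values.split(",")
    key, values = term.split("=", 1)
    return attrs.get(key) in values.split(",")


def route(
    attrs: Dict[str, str], rules: List[List[str]] = ROUTING_RULES
) -> Optional[Tuple[str, Optional[str]]]:
    """Return (queue, shard) for the first rule whose query matches, else None."""
    for rule in rules:
        query, queue = rule[0], rule[1]
        shard = rule[2] if len(rule) > 2 else None  # None: no shard named in the rule
        if query == "*" or all(term_matches(t, attrs) for t in query.split("&")):
            return queue, shard
    return None
```

In this sketch, only workers that fall through every earlier rule reach the `*` rule, so only the catchall traffic would be sent to `queues_shard_a`.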

Other options:

  • redis-sidekiq-catchall / redis-sidekiq-catchall-a. I'm leaning away from this as it could be overly restrictive in the future, e.g. if we want to move 2 queues to the next Redis instance, what should we then name that new instance?
  • redis-sidekiq-shard-01. This could create confusion with redis-cluster's FQDNs, since redis-cluster also uses shard and 01.

1. Secrets setup

We will need more secrets than usual since we have 2 extra users: sentinel and gitlab_monitor.

Before running the following, we may need to create the vault using gkms-vault-create. Alternatively we could keep the information in the redis-cluster vault.

./bin/gkms-vault-edit redis <ENV>
{
  ...,
  "redis-sidekiq-catchall-a": {
    "sentinel_conf": {
      "user": "sentinel",
      "password": "<SENTINEL_REDACTED>"
    },
    "redis_conf": {
      "masteruser": "replica",
      "masterauth": "<REPLICA_REDACTED>",
      "user": [
        "default off",
        "replica on ~* &* +@all ><REPLICA_REDACTED>",
        "sentinel on ~* &* +@all ><SENTINEL_REDACTED>",
        "console on ~* &* +@all ><CONSOLE_REDACTED>",
        "redis_exporter on +client +ping +info +config|get +cluster|info +slowlog +latency +memory +select +get +scan +xinfo +type +pfcount +strlen +llen +scard +zcard +hlen +xlen +eval allkeys ><EXPORTER_REDACTED>",
        "gitlab_monitor on +client +ping +info +config|get +info +slowlog +select +get +scan +zscan +sscan +type +hget +hmget +hgetall +exists +zrangebyscore +zrange +zcount +llen +lrange +lindex +scard +zcard allkeys ><SK_EXPORTER_REDACTED>",
        "rails on ~* &* +@all -debug ><RAILS_REDACTED>"
      ]
    }
  }
}
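
Since the same password appears in more than one place in this structure (masterauth and the replica ACL line, sentinel_conf and the sentinel ACL line), a small consistency check before saving the vault may help. The following is a rough Python sketch; the function names are hypothetical and it assumes each ACL line is a space-separated rule where the password is the token starting with `>`.

```python
from typing import Dict, List


def acl_passwords(user_rules: List[str]) -> Dict[str, List[str]]:
    """Map each ACL user name to the passwords given by its '>password' tokens."""
    passwords: Dict[str, List[str]] = {}
    for rule in user_rules:
        name, *tokens = rule.split()
        passwords[name] = [t[1:] for t in tokens if t.startswith(">")]
    return passwords


def check_secret_consistency(entry: dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means consistent."""
    problems: List[str] = []
    redis_conf = entry["redis_conf"]
    sentinel_conf = entry["sentinel_conf"]
    by_user = acl_passwords(redis_conf["user"])

    # masteruser/masterauth must match an enabled ACL user's password.
    if redis_conf["masterauth"] not in by_user.get(redis_conf["masteruser"], []):
        problems.append("masterauth does not match the ACL entry for masteruser")

    # The sentinel credentials must match the sentinel ACL user's password.
    if sentinel_conf["password"] not in by_user.get(sentinel_conf["user"], []):
        problems.append("sentinel_conf password does not match the sentinel ACL entry")

    return problems
```

It could be run against the `redis-sidekiq-catchall-a` object after loading the edited JSON with `json.load`.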

Note that gitlab_monitor allowed read commands are obtained from:

./bin/gkms-vault-edit redis-exporter <ENV>

...
{
  "redis_exporter": {
    "redis-sidekiq-catchall-a": {
      "env": {
        "REDIS_PASSWORD": "<EXPORTER_REDACTED>"
      }
    }
  }
}

The gitlab_monitor credentials need to be loaded into gitlab-omnibus-secrets in this structure (referencing https://gitlab.com/gitlab-cookbooks/gitlab-monitor/-/blob/master/recipes/redis.rb#L18):

"omnibus-gitlab": {
  "gitlab_rb": {
    "redis-sidekiq-catchall-a": {
      "password": "<SK_EXPORTER_REDACTED>"
    }
  }
}

2. Chef roles setup

The chef-roles will need to include the gitlab-monitor::redis recipe.

Chores to do:

  1. Add gitlab-redis cookbook (https://gitlab.com/gitlab-com/gl-infra/chef-repo/-/merge_requests/4527)
  2. Bump version for gitlab-monitor cookbook in gstg (https://gitlab.com/gitlab-com/gl-infra/chef-repo/-/merge_requests/4533)

3. VM provisioning

This is fairly straightforward and can be done in config-mgmt. The MR is at https://ops.gitlab.net/gitlab-com/gl-infra/config-mgmt/-/merge_requests/7994

4. Assigning replicas and masters

Referencing #2049 (comment 1202287292), we may need to set the replicaof.

We could create a script for the runbook which takes in a list of hosts and:

  1. checks for a 1-master, n-replica topology
  2. sets replicaof if (1) is false.
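
As a starting point for such a script, here is a minimal Python sketch of the two steps as pure functions. It only plans commands and does not connect to Redis. The function names, the default port 6379, and the choice of the first host in the list as master are assumptions for illustration; it also assumes that `master_host` from `INFO replication` is directly comparable with the host names passed in.

```python
from typing import Dict, List, Tuple

# Per-host subset of `INFO replication` fields, e.g. {"role": "slave", "master_host": "a"}
ReplicationInfo = Dict[str, str]


def is_one_master_n_replicas(info_by_host: Dict[str, ReplicationInfo]) -> bool:
    """Step 1: exactly one master, and every other host replicates from it."""
    masters = [h for h, info in info_by_host.items() if info.get("role") == "master"]
    if len(masters) != 1:
        return False
    master = masters[0]
    return all(
        info.get("role") == "slave" and info.get("master_host") == master
        for host, info in info_by_host.items()
        if host != master
    )


def plan_replicaof(
    hosts: List[str], info_by_host: Dict[str, ReplicationInfo], port: int = 6379
) -> List[Tuple[str, str]]:
    """Step 2: return (host, command) pairs needed to reach 1 master, n replicas."""
    if is_one_master_n_replicas({h: info_by_host[h] for h in hosts}):
        return []  # topology is already as expected, nothing to do

    master = hosts[0]  # assumption: the first host listed becomes the master
    plan: List[Tuple[str, str]] = []
    if info_by_host[master].get("role") != "master":
        plan.append((master, "REPLICAOF NO ONE"))
    for host in hosts[1:]:
        info = info_by_host[host]
        if info.get("role") == "slave" and info.get("master_host") == master:
            continue  # already replicating from the chosen master
        plan.append((host, f"REPLICAOF {master} {port}"))
    return plan
```

Returning a plan instead of executing it keeps the destructive step (a replica discards its dataset when it resyncs) under operator review, which seems safer for anything beyond freshly provisioned, empty nodes.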
Edited by Sylvester Chin