Clean up configurations and code for the Redis::Sessions migration

Application side changes

  • update chef configuration to point sessions to the ServiceRedisClusterSessions
    • gstg: https://gitlab.com/gitlab-com/gl-infra/chef-repo/-/merge_requests/5602
    • gprd: https://gitlab.com/gitlab-com/gl-infra/chef-repo/-/merge_requests/5603
  • update k8s configuration to point sessions to the ServiceRedisClusterSessions
    • gstg: gitlab-com/gl-infra/k8s-workloads/gitlab-com!4167 (merged)
    • gprd: gitlab-com/gl-infra/k8s-workloads/gitlab-com!4168 (merged)
  • CRs to coordinate above MRs:
    • gstg: gitlab-com/gl-infra/production#19340 (closed)
    • gprd: gitlab-com/gl-infra/production#19462 (closed)
  • remove MultiStore, the MultiStore feature flags, and the ClusterSessions helper classes: gitlab-org/gitlab!181631 (merged)
  • delete MultiStore feature flags via ChatOps once the MR above reaches production
    • gitlab-org/gitlab#509337 (closed)
    • gitlab-org/gitlab#509338 (closed)
  • remove environment variable USE_REDIS_CACHE_STORE_AS_SESSION_STORE and Gitlab::Sessions::RedisStore implementation:
    • Rails MR: gitlab-org/gitlab!181637 (merged) (not blocked by the CRs above or by the MultiStore removal MR)
    • gstg k8s workload MR: gitlab-com/gl-infra/k8s-workloads/gitlab-com!4169 (merged)
    • gprd k8s workload MR: gitlab-com/gl-infra/k8s-workloads/gitlab-com!4170 (merged)

Config clean up

  • remove cluster_sessions, cluster_db_load_balancing, and any other dangling cluster_* configurations.
    • gstg removal:
      • chef, https://gitlab.com/gitlab-com/gl-infra/chef-repo/-/merge_requests/5744
      • k8s
    • gprd removal:
      • chef,
      • k8s

Removal of ServiceRedisSessions

  • removing runbook service gitlab-com/runbooks!8627 (merged)
    • switch redis cluster's storage selector to sessions
  • removing VMs
    • gstg: https://ops.gitlab.net/gitlab-com/gl-infra/config-mgmt/-/merge_requests/10525
    • gprd: https://ops.gitlab.net/gitlab-com/gl-infra/config-mgmt/-/merge_requests/10526
    • (We didn't do this, but we should next time) Create a silence for GCPScheduledSnapshotsDelayed before removing the VMs to avoid paging the EOC. See the note below.
  • removing chef roles
    • gstg: https://gitlab.com/gitlab-com/gl-infra/chef-repo/-/merge_requests/5719
    • gprd: https://gitlab.com/gitlab-com/gl-infra/chef-repo/-/merge_requests/5720
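
The silence suggested in the VM removal step could be prepared ahead of time. The issue does not say how silences are created, so the following is only a sketch that assumes an Alertmanager-style v2 silence payload; the function name, author, and comment text are hypothetical. In practice the matchers would probably be narrowed to the disks being removed rather than the whole alert.

```python
import json
from datetime import datetime, timedelta, timezone


def build_silence(alertname, duration, created_by, comment, now=None):
    """Build a silence payload in the shape of the Alertmanager v2 API.

    Assumption: the alerting stack accepts this payload format; the
    source issue does not confirm the tooling used.
    """
    now = now or datetime.now(timezone.utc)
    return {
        "matchers": [
            # Exact (non-regex) match on the alert name.
            {"name": "alertname", "value": alertname,
             "isRegex": False, "isEqual": True},
        ],
        "startsAt": now.isoformat(),
        "endsAt": (now + duration).isoformat(),
        "createdBy": created_by,
        "comment": comment,
    }


if __name__ == "__main__":
    payload = build_silence(
        "GCPScheduledSnapshotsDelayed",
        timedelta(weeks=1),                      # matches the 1-week silence used here
        created_by="example-engineer",           # hypothetical author
        comment="Decommissioning ServiceRedisSessions VMs",
    )
    print(json.dumps(payload, indent=2))
```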

Note on GCPScheduledSnapshotsDelayed alert

After the VM was removed, there was a page for the GCPScheduledSnapshotsDelayed alert (incident link). This alert fires when a disk that is expected to have scheduled snapshots (because it appeared in the snapshot logs within the past week) has had no snapshots in the last 6 hours. The alert was then silenced for 1 week, after which it would self-resolve, since the removed disk would no longer appear in the past week of snapshot logs.
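
The alert condition described above can be sketched as follows. This is an illustration of the logic only, not the actual alert rule: the log shape, function name, and disk names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

EXPECTED_WINDOW = timedelta(days=7)  # a disk seen in the logs this recently is "expected"
STALE_AFTER = timedelta(hours=6)     # an expected disk with no snapshot this recently alerts


def delayed_disks(snapshot_log, now):
    """Return the disks that would trigger the alert at time `now`.

    snapshot_log: iterable of (disk_name, snapshot_time) tuples
    (a hypothetical shape chosen for this sketch).
    """
    # Track the most recent snapshot time per disk.
    last_seen = {}
    for disk, ts in snapshot_log:
        if disk not in last_seen or ts > last_seen[disk]:
            last_seen[disk] = ts
    # Expected (seen within a week) but stale (nothing in the last 6 hours).
    return sorted(
        disk for disk, ts in last_seen.items()
        if STALE_AFTER < now - ts <= EXPECTED_WINDOW
    )


if __name__ == "__main__":
    now = datetime(2025, 3, 19, 12, 0, tzinfo=timezone.utc)
    log = [
        ("removed-vm-disk", now - timedelta(hours=8)),  # VM deleted, snapshots stopped
        ("healthy-vm-disk", now - timedelta(hours=1)),  # still being snapshotted
    ]
    print(delayed_disks(log, now))                      # only the removed disk alerts
    print(delayed_disks(log, now + timedelta(days=8)))  # outside the week window: nothing alerts
```

Under this reading, a removed disk keeps alerting until its last snapshot ages out of the one-week window, which is why a 1-week silence is enough to cover it.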

Edited Mar 19, 2025 by Marco Gregorius