[gprd] Enable the use_primary_and_secondary_stores_for_sessions feature flag
Production Change
Change Summary
This issue is to roll out MultiStore with its fallback mechanism on production. MultiStore is currently behind the use_primary_and_secondary_stores_for_sessions feature flag.
MultiStore will use both Redis-SharedState and the newly provisioned Redis-Sessions instance (provisioned in #5969 (closed)).
The rollout issue: scalability#1429 (closed)
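To make the fallback idea concrete, here is a minimal Python sketch of a two-store session cache. It is not GitLab's MultiStore implementation. The class name, the plain dicts standing in for the two Redis instances, the choice of Redis-Sessions as the primary store, and the behaviour when the flag is off are all assumptions made for illustration.

```python
from typing import Dict, Optional


class MultiStoreSketch:
    """Toy model of a two-store session cache with read fallback (hypothetical)."""

    def __init__(self, primary: Dict[str, str], secondary: Dict[str, str],
                 enabled: bool = True) -> None:
        self.primary = primary        # assumed: stands in for Redis-Sessions
        self.secondary = secondary    # assumed: stands in for Redis-SharedState
        self.enabled = enabled        # stands in for the feature flag
        self.read_fallback_total = 0  # counts reads that missed the primary store

    def set(self, key: str, value: str) -> None:
        # With the flag on, write to both stores so either can serve later reads.
        if self.enabled:
            self.primary[key] = value
        self.secondary[key] = value

    def get(self, key: str) -> Optional[str]:
        # With the flag off, behave as before: only the secondary store is used.
        if not self.enabled:
            return self.secondary.get(key)
        value = self.primary.get(key)
        if value is None:
            # Primary miss: count it and fall back to the secondary store.
            self.read_fallback_total += 1
            value = self.secondary.get(key)
        return value
```

In this sketch a session written before the rollout is still readable through the fallback path, and the fallback counter should trend towards zero as new sessions are written to both stores.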
Change Details
- Services Impacted - Service::Redis
- Change Technician - @nmilojevic1
- Change Reviewer - @alejandro
- Time tracking - 45m
- Downtime Component - None
Detailed steps for the change
Incremental rollout. Proposed increments are: 10%, 50%, 100%. The proposed minimum time between increments is 15 minutes.
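The rollout schedule can be sketched as a small Python helper that pairs each increment with its earliest start time. The function name and the tuple layout are hypothetical; the flag name, percentages, and 15-minute spacing come from this issue.

```python
from typing import List, Tuple

FLAG = "use_primary_and_secondary_stores_for_sessions"
INCREMENTS = [10, 50, 100]   # percentages proposed in this issue
MIN_WAIT_MINUTES = 15        # proposed minimum time between increments


def rollout_plan(flag: str = FLAG,
                 increments: List[int] = INCREMENTS,
                 wait: int = MIN_WAIT_MINUTES) -> List[Tuple[int, str]]:
    """Return (earliest minute offset from start, chatops command) pairs."""
    return [
        (step * wait, f"/chatops run feature set {flag} {pct}")
        for step, pct in enumerate(increments)
    ]


if __name__ == "__main__":
    for offset, command in rollout_plan():
        print(f"T+{offset:>2}m  {command}")
```

Under these assumptions the last increment starts no earlier than T+30m, which leaves room for a final 15-minute observation window inside a 45m estimate.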
Pre-Change Steps - steps to be completed before execution of the change
Estimated Time to Complete (mins)
- Set label change::in-progress on this issue
Change Steps - steps to take to execute the change
Estimated Time to Complete (mins) - 45m
- /chatops run feature set use_primary_and_secondary_stores_for_sessions 10
- /chatops run feature set use_primary_and_secondary_stores_for_sessions 50
- /chatops run feature set use_primary_and_secondary_stores_for_sessions 100
Post-Change Steps - steps to take to verify the change
Estimated Time to Complete (mins)
- Review with EOC
Rollback
Rollback steps - steps to be taken in the event of a need to rollback this change
Estimated Time to Complete (mins) - 1m
- /chatops run feature set use_primary_and_secondary_stores_for_sessions false
Monitoring
Key metrics to observe
- Metric: Redis-sessions overview
- Metric: Redis main instance health
- Metric: Prometheus gitlab_redis_multi_store_read_fallback_total
- Metric: Prometheus gitlab_redis_multi_store_method_missing_total
- Metric: Redis::Sessions dashboard
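For the two Prometheus counters listed above, one way to watch them during the rollout is to graph their rate of increase. The helper below is a rough sketch that only builds PromQL strings. The 5-minute window, the `sum` aggregation, and the function name are assumptions, and any label filters the real dashboards use are not shown.

```python
COUNTERS = [
    "gitlab_redis_multi_store_read_fallback_total",
    "gitlab_redis_multi_store_method_missing_total",
]


def rate_query(metric: str, window: str = "5m") -> str:
    """Build a PromQL expression for the summed per-second rate of a counter."""
    return f"sum(rate({metric}[{window}]))"


if __name__ == "__main__":
    for counter in COUNTERS:
        print(rate_query(counter))
```

A sustained non-zero rate on the method-missing counter, or a fallback rate that does not decline after an increment, would be reasonable things to raise before moving to the next percentage.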
Summary of infrastructure changes
- Does this change introduce new compute instances?
- Does this change re-size any existing compute instances?
- Does this change introduce any additional usage of tooling like Elastic Search, CDNs, Cloudflare, etc?
Changes checklist
- This issue has a criticality label (e.g. C1, C2, C3, C4) and a change-type label (e.g. change::unscheduled, change::scheduled) based on the Change Management Criticalities.
- This issue has the change technician as the assignee.
- Pre-Change, Change, Post-Change, and Rollback steps have been filled out and reviewed.
- This Change Issue is linked to the appropriate Issue and/or Epic.
- Necessary approvals have been completed based on the Change Management Workflow.
- Change has been tested in staging and results noted in a comment on this issue.
- A dry-run has been conducted and results noted in a comment on this issue.
- SRE on-call has been informed prior to change being rolled out. (In #production channel, mention @sre-oncall and this issue and await their acknowledgement.)
- Release managers have been informed (if needed; cases include DB changes) prior to change being rolled out. (In #production channel, mention @release-managers and this issue and await their acknowledgement.)
- There are currently no active incidents.
Edited by Nikola Milojevic