Deploy new Postgres 12 replica for GCS snapshots and cascade replication source
Production Change
This change creates a new replica in the production Postgres 12 cluster that will be used to generate GCS snapshots and serve as a cascade replication source.
Change Summary
Provide a high-level summary of the change and its purpose.
Change Details
- Services Impacted - Service::Patroni
- Change Technician - @glopezfernandez
- Change Criticality - C1
- Change Type - change::unscheduled
- Change Reviewer - @Finotto
- Due Date - 2021-05-08 20:00 UTC
- Time tracking - 5 mins
- Downtime Component - None
Detailed steps for the change
Pre-Change Steps - steps to be completed before execution of the change
Estimated Time to Complete (mins) - 5 mins
- Set label change::in-progress on this issue
- Merge https://ops.gitlab.net/gitlab-com/gitlab-com-infrastructure/-/merge_requests/2545
- tf://environment/gprd# tf plan -out plan
- Validate there are 4 new items to add: a boot disk, a log disk, a data disk, and an instance.
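The validation step above can also be checked mechanically. The following is a minimal sketch, assuming the saved plan can be rendered as JSON (for example with terraform show -json plan, if the tf wrapper allows it) and relying on the resource_changes / change.actions layout of Terraform's JSON plan representation. The function names and the script name in the usage note are hypothetical and not part of the original procedure.

```python
import json
import sys
from collections import Counter


def summarize_plan(plan: dict) -> Counter:
    """Count planned actions per kind, ignoring no-op and read entries."""
    counts = Counter()
    for rc in plan.get("resource_changes", []):
        actions = rc["change"]["actions"]
        if actions in (["no-op"], ["read"]):
            continue
        # A replacement shows up as two actions, e.g. ["delete", "create"].
        counts["+".join(actions)] += 1
    return counts


def plan_only_adds(plan: dict, expected_creates: int = 4) -> bool:
    """True only if the plan creates exactly `expected_creates` resources
    and does not update, replace, or delete anything."""
    return dict(summarize_plan(plan)) == {"create": expected_creates}


if __name__ == "__main__":
    # Reads the JSON plan from stdin and exits non-zero on a mismatch.
    plan = json.load(sys.stdin)
    print(dict(summarize_plan(plan)))
    sys.exit(0 if plan_only_adds(plan) else 1)
```

A possible invocation, with a hypothetical script name: terraform show -json plan | python check_plan.py. This does not replace reading the plan. It only guards against an unexpected update or destroy slipping in alongside the 4 additions.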
Change Steps - steps to take to execute the change
Estimated Time to Complete (mins) - 5 mins
- tf://environment/gprd# tf apply plan
Post-Change Steps - steps to take to verify the change
Estimated Time to Complete (mins) - 1 min
- Verify the new node is in the cluster with gitlab-patronictl list, and that it carries the noloadbalance and nofailover tags
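The verification above could be scripted roughly as follows. This is a sketch under several assumptions: that gitlab-patronictl passes a JSON format option through to patronictl list, that each entry has Member and State keys, and that tags (when the Patroni version reports them at all) appear as a dict under a Tags key. The member name used in the usage note is hypothetical. If any of these assumptions do not hold, check the list output by eye instead.

```python
import json

# States treated as healthy; the exact wording varies by Patroni version (assumption).
HEALTHY_STATES = {"running", "streaming"}


def check_member(members, name, required_tags=("noloadbalance", "nofailover")):
    """Return a list of problems found for `name`; an empty list means it looks fine."""
    match = [m for m in members if m.get("Member") == name]
    if not match:
        return [f"{name} not found in cluster"]
    member = match[0]
    problems = []
    state = member.get("State")
    if state not in HEALTHY_STATES:
        problems.append(f"{name} is in state {state!r}")
    tags = member.get("Tags")
    if tags is None:
        # Older output may not include tags; fall back to a manual check.
        problems.append("no Tags field in output; verify tags manually")
    else:
        for tag in required_tags:
            if not tags.get(tag):
                problems.append(f"tag {tag} is not set")
    return problems


def check_from_json(text, name):
    """Parse the JSON text produced by the list command and check one member."""
    return check_member(json.loads(text), name)
```

For example, check_from_json(output, "new-replica-hostname") returns an empty list when the hypothetical member new-replica-hostname is present, healthy, and tagged as expected.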
Rollback
Rollback steps - steps to be taken in the event of a need to rollback this change
Estimated Time to Complete (mins) - Estimated Time to Complete in Minutes
Monitoring
Key metrics to observe
Summary of infrastructure changes
- Does this change introduce new compute instances? Yes
- Does this change re-size any existing compute instances? No
- Does this change introduce any additional usage of tooling like Elastic Search, CDNs, Cloudflare, etc? No
New Postgres 12 replica instance with noloadbalance and nofailover tags.
Changes checklist
- This issue has a criticality label (e.g. C1, C2, C3, C4) and a change-type label (e.g. change::unscheduled, change::scheduled) based on the Change Management Criticalities.
- This issue has the change technician as the assignee.
- Pre-Change, Change, Post-Change, and Rollback steps have been filled out and reviewed.
- Necessary approvals have been completed based on the Change Management Workflow.
- Change has been tested in staging and results noted in a comment on this issue.
- A dry-run has been conducted and results noted in a comment on this issue.
- SRE on-call has been informed prior to change being rolled out. (In the #production channel, mention @sre-oncall and this issue and await their acknowledgement.)
- There are currently no active incidents.
Edited by Gerardo Lopez-Fernandez