Scale up web fleet
C3
Production Change - Criticality 3

Change Objective | Describe the objective of the change |
---|---|
Change Type | Operation |
Services Impacted | ~"Service:Web" |
Change Team Members | @hphilipps |
Change Severity | C3 |
Change Reviewer or tested in staging | A colleague will review |
Dry-run output | If the change is done through a script, it is mandatory to have a dry-run capability in the script, run the change in dry-run mode and output the result |
Due Date | Date and time in UTC timezone for the execution of the change, if possible add the local timezone of the engineer executing the change |
Time tracking | To estimate and record times associated with changes ( including a possible rollback ) |
Detailed steps for the change. For each step the following must be considered:
- pre-conditions for execution of the step - how to verify it is safe to proceed
  - ensure 38 web nodes active
- execution commands for the step - what to do
  - add 2 nodes via terraform: https://ops.gitlab.net/gitlab-com/gitlab-com-infrastructure/merge_requests/1184
  - make sure they are added to chef
  - add new nodes to gprd-base-lb-fe role: https://ops.gitlab.net/gitlab-cookbooks/chef-repo/merge_requests/2155
  - run chef-client
- post-execution validation for the step - how to verify the step succeeded
  - Make sure nodes are registered by the fe LBs and traffic is being sent to them
  - Make sure new nodes are successfully processing requests
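The pre-condition and post-execution checks above could be scripted rather than eyeballed. The following is a minimal sketch, not the procedure actually used for this change. It assumes the fe LBs expose HAProxy-style stats CSV (the issue does not say which load balancer software is in use), and the backend name and node names passed in are hypothetical. How the CSV is fetched from each LB is left out.

```python
import csv
import io


def servers_in_backend(stats_csv: str, backend: str) -> dict:
    """Return {server name: CSV row} for one backend of an HAProxy-style stats dump."""
    # HAProxy prefixes the header line with "# "; strip it so the column names are clean.
    text = stats_csv.lstrip()
    if text.startswith("# "):
        text = text[2:]
    rows = csv.DictReader(io.StringIO(text))
    return {
        row["svname"]: row
        for row in rows
        # FRONTEND and BACKEND are aggregate rows, not individual servers.
        if row["pxname"] == backend and row["svname"] not in ("FRONTEND", "BACKEND")
    }


def check_fleet(stats_csv: str, backend: str, expected_count: int, new_nodes=()) -> list:
    """Return a list of problems; an empty list means the check passed.

    expected_count: 38 before the change, 40 after adding the 2 nodes.
    new_nodes: names of the added nodes (hypothetical), checked for status and traffic.
    """
    servers = servers_in_backend(stats_csv, backend)
    up = {name for name, row in servers.items() if row["status"].startswith("UP")}

    problems = []
    if len(up) != expected_count:
        problems.append(f"expected {expected_count} servers UP, found {len(up)}")
    for node in new_nodes:
        if node not in up:
            problems.append(f"{node} is not UP in {backend}")
        elif int(servers[node]["stot"] or 0) == 0:
            # stot is the cumulative session count; zero means no traffic was sent yet.
            problems.append(f"{node} has received no sessions yet")
    return problems
```

Before the change, `check_fleet(csv_text, backend, 38)` should return an empty list. After the chef-client run, `check_fleet(csv_text, backend, 40, new_nodes=[...])` with the two new node names covers both validation points. Real stats output carries many more columns than the three read here.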
It is strongly recommended to:
- Note relevant graphs in grafana to monitor the effect of the change, including how to identify that it has worked, or has caused undue negative effects
- Review alerts that may go off that can be silenced pro-actively
Rollback steps
- revert https://ops.gitlab.net/gitlab-cookbooks/chef-repo/merge_requests/2155
- revert https://ops.gitlab.net/gitlab-com/gitlab-com-infrastructure/merge_requests/1184
- run `terraform apply -target=module.web`
Edited by Henri Philipps