Enable puma fleet-wide on gitlab.com

Production Change - Criticality 3 (C3)

Change Objective	Describe the objective of the change
Change Type	ConfigurationChange
Services Impacted	rails
Change Team Members	Name of the engineers involved in the change
Change Severity	C3
Change Reviewer or tested in staging	A colleague who will review the change or evidence the change was tested on staging environment
Dry-run output	If the change is done through a script, the script must have a dry-run capability; run the change in dry-run mode and output the result
Due Date	Date and time (UTC) for the execution of the change; if possible, add the local timezone of the engineer executing the change
Time tracking	To estimate and record times associated with changes (including a possible rollback)

Note: Staging is already running puma, so on staging we will verify that we can switch to unicorn and then back to puma.

knife ssh 'roles:gstg-base-fe-api OR roles:gstg-base-fe-web OR roles:gstg-base-fe-git' 'sudo service chef-client stop'

Switch staging to unicorn (this step causes a short interruption on staging, ~2 minutes)
Merge the role update to switch back to puma
Execute a rolling pipeline with haproxy drains to switch nodes to puma
- /chatops run deploycmd chefclient base_fe_web --no-check
- /chatops run deploycmd chefclient base_fe_api --no-check
- /chatops run deploycmd chefclient base_fe_git --no-check
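
The staging and production command sets in this issue differ only in the environment prefix of the role names (gstg vs gprd) and the --production flag. The Python sketch below generates both sets from one place to avoid copy-paste mistakes. It assumes nothing beyond the commands quoted in this issue; the function names are hypothetical and are not part of knife, chatops, or the deployer.

```python
from typing import List

# Fleet order as it appears in the knife and chatops commands of this issue.
KNIFE_FLEETS = ("api", "web", "git")
ROLLOUT_FLEETS = ("web", "api", "git")


def stop_chef_command(env: str) -> str:
    """Build the knife ssh command that stops chef-client on all three fleets.

    env is the prefix used in the role names, e.g. "gstg" or "gprd".
    """
    query = " OR ".join(f"roles:{env}-base-fe-{fleet}" for fleet in KNIFE_FLEETS)
    return f"knife ssh '{query}' 'sudo service chef-client stop'"


def rollout_commands(production: bool) -> List[str]:
    """Build the chatops commands for the rolling chef-client run."""
    flag = " --production" if production else ""
    return [
        f"/chatops run deploycmd chefclient base_fe_{fleet}{flag} --no-check"
        for fleet in ROLLOUT_FLEETS
    ]


if __name__ == "__main__":
    print(stop_chef_command("gstg"))
    for command in rollout_commands(production=False):
        print(command)
```

Calling the same functions with "gprd" and production=True yields the production variants used later in this change.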

Monitor the fleet closely before and after this change for any latency degradation.

Precheck: Confirm that all services are meeting our SLOs on dashboards
Precheck: From logs, record the workhorse 95th and 60th percentile latency (duration_ms) for api/web/git here:
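
If the percentiles are not read directly from a log search UI, they can be computed from exported log lines. The sketch below is a minimal Python example. It assumes the workhorse logs have been exported as newline-delimited JSON with a numeric duration_ms field, and it uses the nearest-rank percentile definition. The function name is hypothetical, and other tools may interpolate and report slightly different values.

```python
import json
import math
from typing import Dict, Iterable, Sequence


def duration_percentiles(
    log_lines: Iterable[str], percentiles: Sequence[int] = (60, 95)
) -> Dict[int, float]:
    """Return nearest-rank percentiles of duration_ms from JSON log lines.

    Lines that are blank, not valid JSON, or missing a numeric
    duration_ms field are skipped.
    """
    durations = []
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        value = record.get("duration_ms") if isinstance(record, dict) else None
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            durations.append(float(value))

    if not durations:
        raise ValueError("no duration_ms values found")

    durations.sort()
    result = {}
    for p in percentiles:
        # Nearest rank: the smallest value with at least p% of samples at or below it.
        rank = max(1, math.ceil(p * len(durations) / 100))
        result[p] = durations[rank - 1]
    return result
```

Running it once per fleet (api, web, git) on the same time window before and after the change keeps the numbers comparable.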

knife ssh 'roles:gprd-base-fe-api OR roles:gprd-base-fe-web OR roles:gprd-base-fe-git' 'sudo service chef-client stop'

Merge the role update to switch to puma https://ops.gitlab.net/gitlab-cookbooks/chef-repo/-/merge_requests/2688
Execute a rolling pipeline with haproxy drains to switch nodes to puma
- /chatops run deploycmd chefclient base_fe_web --production --no-check
- /chatops run deploycmd chefclient base_fe_api --production --no-check
- /chatops run deploycmd chefclient base_fe_git --production --no-check
- web https://ops.gitlab.net/gitlab-com/gl-infra/deployer/-/jobs/926420
- api https://ops.gitlab.net/gitlab-com/gl-infra/deployer/-/jobs/926422
- git https://ops.gitlab.net/gitlab-com/gl-infra/deployer/-/jobs/926424
Postcheck: Confirm that all services are meeting our SLOs on dashboards
Postcheck: From logs, record the workhorse 95th and 60th percentile latency (duration_ms) for api/web/git here:
- api
- web
- git
Remove node overrides on api, web, and git fleet that had puma enabled on individual nodes
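
The precheck and postcheck latency figures are only useful when compared. The sketch below is a small Python example that flags fleets whose percentile latency grew by more than a chosen threshold. The 10% default is an arbitrary illustration, not a threshold from this change plan, and the function name and input layout are hypothetical.

```python
from typing import Dict, List, Tuple

# Layout assumed for illustration: {fleet: {percentile: duration_ms}}
Latencies = Dict[str, Dict[int, float]]


def latency_regressions(
    before: Latencies, after: Latencies, threshold_pct: float = 10.0
) -> List[Tuple[str, int, float]]:
    """List (fleet, percentile, percent change) where latency rose above the threshold."""
    flagged = []
    for fleet in sorted(before):
        for pct in sorted(before[fleet]):
            old = before[fleet][pct]
            new = after[fleet][pct]
            if old <= 0:
                raise ValueError(f"non-positive baseline for {fleet} p{pct}")
            change = (new - old) / old * 100.0
            if change > threshold_pct:
                flagged.append((fleet, pct, round(change, 1)))
    return flagged


if __name__ == "__main__":
    # Made-up numbers, purely to show the call shape.
    precheck = {"api": {60: 100.0, 95: 200.0}, "web": {60: 100.0, 95: 400.0}}
    postcheck = {"api": {60: 105.0, 95: 250.0}, "web": {60: 100.0, 95: 400.0}}
    for fleet, pct, change in latency_regressions(precheck, postcheck):
        print(f"{fleet} p{pct} latency up {change}%")
```

A non-empty result is a prompt to look at the dashboards more closely and consider the rollback steps, not an automatic rollback decision.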

knife ssh 'roles:gprd-base-fe-api OR roles:gprd-base-fe-web OR roles:gprd-base-fe-git' 'sudo service chef-client stop'

Rollback MR to disable puma and re-enable unicorn https://ops.gitlab.net/gitlab-cookbooks/chef-repo/-/merge_requests/2688
Execute a rolling pipeline with haproxy drains to switch nodes to unicorn
- /chatops run deploycmd chefclient base_fe_web --production --no-check
- /chatops run deploycmd chefclient base_fe_api --production --no-check
- /chatops run deploycmd chefclient base_fe_git --production --no-check

Edited Feb 19, 2020 by John Jarvis