Enable puma fleet-wide on gitlab.com

Production Change - Criticality 3 C3

Change Objective: Describe the objective of the change
Change Type: ConfigurationChange
Services Impacted: rails
Change Team Members: Name of the engineers involved in the change
Change Severity: C3
Change Reviewer or tested in staging: A colleague who will review the change, or evidence that the change was tested in the staging environment
Dry-run output: If the change is done through a script, the script must have a dry-run capability; run the change in dry-run mode and output the result
Due Date: Date and time in UTC for the execution of the change; if possible, add the local timezone of the engineer executing the change
Time tracking: To estimate and record times associated with changes (including a possible rollback)

Detailed steps for the change

Staging validation

Note: Staging is already running puma, so we will verify on staging that we can switch to unicorn and back to puma.

  • Stop chef on all nodes where unicorn/puma is running
knife ssh 'roles:gstg-base-fe-api OR roles:gstg-base-fe-web OR roles:gstg-base-fe-git' 'sudo service chef-client stop'
  • Switch staging to unicorn (this step causes a short interruption on staging, ~2 minutes)

  • Merge the role update to switch to puma

  • Execute a rolling pipeline with haproxy drains to switch nodes to puma

    • /chatops run deploycmd chefclient base_fe_web --no-check
    • /chatops run deploycmd chefclient base_fe_api --no-check
    • /chatops run deploycmd chefclient base_fe_git --no-check
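Before merging the role update, it may be worth confirming that chef-client is actually stopped on every node. The sketch below is one way to do that, not part of the original plan. The helper names `role_query` and `chef_status` are hypothetical. It assumes the same role naming as the stop step above, with `gstg` for staging and `gprd` for production, and that `service chef-client status` reports the service state on these nodes.

```shell
#!/usr/bin/env bash
# Build the knife search query for the three frontend roles of one environment.
# $1 is the environment prefix used in the role names: "gstg" or "gprd".
role_query() {
  local env="$1"
  printf 'roles:%s-base-fe-api OR roles:%s-base-fe-web OR roles:%s-base-fe-git' \
    "$env" "$env" "$env"
}

# Hypothetical helper: print the chef-client service state on every matching node.
# Requires a working knife configuration; nothing runs until it is called.
chef_status() {
  knife ssh "$(role_query "$1")" 'sudo service chef-client status'
}
```

Running `chef_status gstg` would list the service state per staging node; any node still reporting chef-client as running should be stopped before the rolling pipeline starts.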

Production monitoring

It is extremely important to monitor the fleet before and after this change for any latency degradation.

Dashboards

Logs

Production apply

  • Precheck: Confirm that all services are meeting our SLOs on dashboards
  • Precheck: From logs, record the workhorse 95th and 60th percentile latency (duration_ms) for api/web/git here:
    Screenshot: Screen_Shot_2020-02-19_at_3.43.34_PM
    Screenshot: Screen_Shot_2020-02-19_at_3.43.26_PM

  • Stop chef on all nodes where unicorn/puma is running
knife ssh 'roles:gprd-base-fe-api OR roles:gprd-base-fe-web OR roles:gprd-base-fe-git' 'sudo service chef-client stop'
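The latency precheck above reads percentiles of duration_ms from the logging dashboards. If raw values are available offline, a minimal sketch for computing the same percentiles is shown here. The function name `percentile` is hypothetical; it assumes one numeric duration_ms value per line on standard input and uses the nearest-rank method, which may differ slightly from the interpolation a dashboard applies.

```shell
#!/usr/bin/env bash
# percentile P: read one numeric value per line on stdin and print the
# P-th percentile using the nearest-rank method (P is an integer 1..100).
percentile() {
  local p="$1"
  sort -n | awk -v p="$p" '
    { v[NR] = $1 }                       # values arrive sorted ascending
    END {
      if (NR == 0) exit 1                # no input: fail without printing
      rank = int((p * NR + 99) / 100)    # ceil(p * NR / 100) for integer p
      if (rank < 1) rank = 1
      print v[rank]
    }'
}
```

For example, assuming a hypothetical file `workhorse.log` of JSON lines with a `duration_ms` field, `jq -r '.duration_ms' workhorse.log | percentile 95` would print the 95th percentile. Recording the values before and after the change gives a direct comparison for api, web, and git.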

Rollback steps

  • Ensure chef is stopped on all nodes
knife ssh 'roles:gprd-base-fe-api OR roles:gprd-base-fe-web OR roles:gprd-base-fe-git' 'sudo service chef-client stop'
  • Revert the MR that disabled unicorn and enabled puma: https://ops.gitlab.net/gitlab-cookbooks/chef-repo/-/merge_requests/2688
  • Execute a rolling pipeline with haproxy drains to switch nodes to unicorn
    • /chatops run deploycmd chefclient base_fe_web --production --no-check
    • /chatops run deploycmd chefclient base_fe_api --production --no-check
    • /chatops run deploycmd chefclient base_fe_git --production --no-check

Changes checklist

  • Detailed steps and rollback steps have been filled prior to commencing work
  • Person on-call has been informed prior to change being rolled out
Edited by John Jarvis