[Gstg] Roll out queue-per-shard to workers of shard `elasticsearch`

Production Change

Change Summary

Please read scalability#1136 for more information. This change issue routes all jobs for workers in the elasticsearch shard to the elasticsearch queue on Staging.
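As a rough illustration of what routing to one queue per shard means, the sketch below maps a worker to a queue name before and after the change. The worker names, the `SHARD_WORKERS` table, and the `queue_for` helper are hypothetical placeholders, not the actual Staging configuration, and modelling the per-worker queue as the worker name itself is an assumption.

```python
# Hypothetical illustration only: the worker names and the shard table are
# placeholders, not the real Staging configuration.
SHARD_WORKERS = {
    "elasticsearch": {"ExampleIndexingWorker", "ExampleCommitIndexerWorker"},
}


def queue_for(worker: str, queue_per_shard: bool) -> str:
    """Return the queue that a worker's jobs are routed to."""
    if queue_per_shard:
        for shard, workers in SHARD_WORKERS.items():
            if worker in workers:
                # After the change: a single queue named after the shard.
                return shard
    # Before the change (or for workers outside any listed shard):
    # a per-worker queue, modelled here as the worker name itself.
    return worker
```

With the flag off, each worker keeps its own queue; with it on, every worker listed under the shard shares the elasticsearch queue.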

Change Details

  1. Services Impacted - ServiceSidekiq ServiceAPI ServiceWeb ServiceGit
  2. Change Technician - @cmiskell / @qmnguyen0711
  3. Change Reviewer - @cmiskell / @qmnguyen0711
  4. Time tracking - 2hr
  5. Downtime Component - No downtime

Detailed steps for the change

Pre-Change Steps - steps to be completed before execution of the change

Estimated Time to Complete (10 mins)

Change Steps - steps to take to execute the change

Estimated Time to Complete (80 mins)

Post-Change Steps - steps to take to verify the change

Estimated Time to Complete (30 mins)

Because elasticsearch is a busy shard, we only need to wait a short while and then observe the shard, queue, and worker logs in Kibana: https://nonprod-log.gitlab.net/goto/2721b6c52bee324d3dfeeca5fd62f247. If no logs appear, edit something, such as an issue title, and wait for the reindex event. The expected result is that json.queue changes from a per-worker name to elasticsearch.

Rollback

Rollback steps - steps to be taken in the event of a need to rollback this change

Estimated Time to Complete (90 mins)

Monitoring

Key metrics to observe

After the changes are applied, verify that the queue field in the Kibana logs shows elasticsearch for all of the aforementioned workers. There should also be no abnormal behavior in the jobs afterward.
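The check described above could be sketched as a small script run over exported log records. This is a minimal sketch under assumptions: each record is a dict carrying the `json.queue` field mentioned in this issue, the worker is assumed to appear under a `json.class` key, and the sample records and worker names are made up for illustration.

```python
from collections import defaultdict

EXPECTED_QUEUE = "elasticsearch"


def workers_on_wrong_queue(records):
    """Map each worker to the set of unexpected queue names seen in its logs."""
    offenders = defaultdict(set)
    for record in records:
        queue = record.get("json.queue")
        if queue != EXPECTED_QUEUE:
            # "json.class" as the worker field is an assumption.
            worker = record.get("json.class", "<unknown>")
            offenders[worker].add(queue)
    return dict(offenders)


# Made-up sample records, not real Staging logs.
sample = [
    {"json.class": "ExampleIndexingWorker", "json.queue": "elasticsearch"},
    {"json.class": "ExampleOtherWorker", "json.queue": "example_other"},
]
```

An empty result from `workers_on_wrong_queue` means every sampled job ran from the elasticsearch queue; any entry names a worker that still logs a different queue.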

Summary of infrastructure changes

  • Does this change introduce new compute instances?
  • Does this change re-size any existing compute instances?
  • Does this change introduce any additional usage of tooling like Elasticsearch, CDNs, Cloudflare, etc?

None

Changes checklist

  • This issue has a criticality label (e.g. C1, C2, C3, C4) and a change-type label (e.g. changeunscheduled, changescheduled) based on the Change Management Criticalities.
  • This issue has the change technician as the assignee.
  • Pre-Change, Change, Post-Change, and Rollback steps have been filled out and reviewed.
  • Necessary approvals have been completed based on the Change Management Workflow.
  • Change has been tested in staging and results noted in a comment on this issue.
  • A dry-run has been conducted and results noted in a comment on this issue.
  • SRE on-call has been informed prior to change being rolled out. (In #production channel, mention @sre-oncall and this issue and await their acknowledgement.)
  • There are currently no active incidents.
Edited by Craig Miskell