Third test rollout of Ruby 3.1 package to gstg-ref, gstg-cny

Proposal

Deploy a Ruby 3.1 package to gstg-ref before the planned production rollout.

During the second test rollout, gstg-ref was broken and we couldn't verify Sidekiq running on Ruby 3.1. This test rollout will give us confidence in running Sidekiq on Ruby 3.1.

Deploy to gstg-ref (and gstg-cny)

Run git commit --allow-empty -m "Empty commit to trigger a new auto-deploy pkg" to add an extra commit to the Omnibus or CNG auto deploy branch.
Trigger a deployment pipeline by manually running the inactive scheduled pipeline "MANUAL auto-deploy pick&tag": https://ops.gitlab.net/gitlab-org/release/tools/-/pipeline_schedules/.
Cancel the packager pipelines created in the previous step and manually start new pipelines on the same tags. When starting each pipeline, set USE_NEXT_RUBY_VERSION_IN_AUTODEPLOY to true and NEXT_RUBY_VERSION to 3.1.4. The NEXT_RUBY_VERSION variable might not be required, depending on whether the 2nd and 3rd steps in gitlab-org&11659 (closed) are checked off. Check with @balasankarc to confirm.
Notify Slack channels #development, #backend, #frontend and #staging-ref that the deployment has started.
Notify @igor.drozdov, @mkaeppler and @niskhakova when the deployment is done so they can proceed with Monitor second experimental Ruby 3.1 package roll out to staging.
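
The variables for the manually started packager pipelines can be sketched as a request body for GitLab's "create a new pipeline" API (POST /projects/:id/pipeline), for anyone who prefers the API over the UI. This is only an illustration: the function name, the include_version flag, and the placeholder tag are hypothetical, and the project ID and authentication are deliberately left out.

```python
import json


def packager_pipeline_payload(tag, next_ruby_version="3.1.4", include_version=True):
    """Build the body for POST /projects/:id/pipeline on a given tag.

    include_version is a hypothetical switch: NEXT_RUBY_VERSION might not be
    required, so it can be dropped once that has been confirmed.
    """
    variables = [
        {"key": "USE_NEXT_RUBY_VERSION_IN_AUTODEPLOY", "value": "true"},
    ]
    if include_version:
        variables.append({"key": "NEXT_RUBY_VERSION", "value": next_ruby_version})
    # "ref" is the tag the cancelled packager pipeline was running on.
    return {"ref": tag, "variables": variables}


if __name__ == "__main__":
    # "<auto-deploy-tag>" is a placeholder, not a real tag name.
    print(json.dumps(packager_pipeline_payload("<auto-deploy-tag>"), indent=2))
```

Building the payload separately from sending it makes it easy to double-check the variable names before any pipeline is started.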

Continue auto deploys

Allow the package to bake for 1 hour on gprd-cny.
Once a new package has been deployed to gstg-cny, gstg-ref, and gprd-cny, notify the Slack channels that the Ruby 3.1 package is no longer on those environments.

Key metrics to observe

Dashboards/metrics:
- Monitor the following dashboards for an unhealthy dip in service health in the environment/cluster that is being rolled out.
- Deployment health, configurable with environment, stage, and type/service
- Kubernetes compute resource/cluster health, configurable with clusters
- Kubernetes compute resource/pods health, configurable with clusters and namespace
- Kubernetes networking, configurable with clusters
- Per-service dashboards (change env and stage to toggle between gstg/gprd and main/cny):
  - api (overview, containers)
  - web (overview, containers)
  - websockets (overview, containers)
  - git (overview, containers)
  - sidekiq (overview, containers)
- Kibana - Puma (edit json.type to filter by service, json.stage for cny vs main)
  - Staging 5xx responses
- Kibana - Sidekiq (edit json.shard to switch between job types)
  - Failed staging jobs
- GCP - staging-ref logs
- Sentry
  - Staging overview
QA runs can be observed via Slack:
- #announcements - Besides QA messages, multiple messages are sent to this channel to account for the different deployments.
- QA Slack channels - There is one channel per environment. For example, a failure on gstg or gstg-cny is posted in #qa-staging, a failure on gprd-cny or gprd is posted in #qa-production, etc.
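
As a quick reference, the environment-to-channel pairing described above can be written as a small lookup. This is a sketch: the names QA_CHANNEL_BY_ENV and qa_channel_for are hypothetical, and only the pairings stated in this issue are included.

```python
from typing import Optional

# Only the pairings named in this issue; channels for other environments
# are not stated here, so they are intentionally left out.
QA_CHANNEL_BY_ENV = {
    "gstg": "#qa-staging",
    "gstg-cny": "#qa-staging",
    "gprd-cny": "#qa-production",
    "gprd": "#qa-production",
}


def qa_channel_for(environment: str) -> Optional[str]:
    """Return the QA Slack channel for an environment, or None if not listed."""
    return QA_CHANNEL_BY_ENV.get(environment)
```

For example, qa_channel_for("gstg-cny") returns "#qa-staging", while an environment that is not listed returns None.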
Dealing with deploy failures: https://gitlab.com/gitlab-org/release/docs/-/blob/master/general/deploy/failures.md

Edited Nov 21, 2023 by Reuben Pereira