Upgrade Prometheus to 2.30.0

We're mainly looking at upgrading to include the following fix: https://github.com/prometheus/prometheus/pull/9259

  1. Ensure no breaking changes are included.
  2. Prepare GSTG MRs.
  3. Update GSTG.
  4. Observe the monitoring::overview metrics as well as a sample of other metrics (patroni, gitaly).
  5. Prepare GPRD MRs.
  6. Create a silence similar to https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/14152#note_678016129
  7. Disable chef-client on affected VMs.
  8. Merge GPRD MRs.
  9. Perform a controlled rollout to VMs.
  10. Enable chef-client on affected VMs.
  11. Manually trigger to GPRD zones, one at a time.
  12. Observe the monitoring::overview metrics as well as a sample of other metrics (patroni, gitaly).
  13. Revert GSTG MRs
Edited by Rehab