Upgrade Prometheus to 2.30.0
We're mainly looking at upgrading to include the following fix: https://github.com/prometheus/prometheus/pull/9259
-
Ensure no breaking changes are included. -
Prepare GSTG MRs. -
Update GSTG. -
Observe the monitoring::overview metrics as well as a sample of other metrics (patroni, gitaly). -
Prepare GPRD MRs. -
Create a silence similar to https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/14152#note_678016129 -
Disable chef-client on affected VMs. -
Merge GPRD MRs. -
Perform a controlled rollout to VMs. -
Enable chef-client on affected VMs. -
Manually trigger to GPRD zones, one at a time. -
Observe the monitoring::overview metrics as well as a sample of other metrics (patroni, gitaly). -
Revert GSTG MRs
Edited by Rehab