Group-level Prometheus only persists alerts for one project
### Summary
Spun out of this comment: #26704 (comment 222571810)
When a group-level cluster has a GitLab-managed Prometheus, only the alerts for the project with the most recently updated alert are persisted on the Prometheus server.
### Steps to reproduce
- Create a group with a group-level cluster
- Install Prometheus
- Create a couple of projects under the group and deploy both of them
- Connect to your Kubernetes cluster and observe `/etc/config/alerts` on the server. For example, run `kubectl -n gitlab-managed-apps exec -it $PROMETHEUS_POD_NAME -c prometheus-server watch cat /etc/config/alerts`, replacing `$PROMETHEUS_POD_NAME` with whatever the currently deployed Prometheus server pod is
- Create an alert on the first project and wait until it is reflected in `/etc/config/alerts`. Make note of the values
- Create an alert on the second project and observe that it overwrites all of `/etc/config/alerts`
### What is the current bug behavior?
`/etc/config/alerts` only contains alerts from one of the two projects.
### What is the expected correct behavior?
`/etc/config/alerts` should have alerts from both projects.
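
To make the distinction concrete, here is a rough illustration in Ruby, assuming the standard Prometheus rules-file layout. The group names come from the projects mentioned below; the alert names and expressions are invented, and the exact structure GitLab writes may differ.

```ruby
require 'yaml'

# Hypothetical rule groups, one per project, in the standard Prometheus
# rules-file shape; the alert names and expressions below are made up.
app_one = {
  'name'  => 'the-group/minimal-ruby-app',
  'rules' => [{ 'alert' => 'ResponseTimeHigh', 'expr' => 'avg(http_request_duration_seconds) > 1' }]
}
app_two = {
  'name'  => 'the-group/minimal-ruby-app-2',
  'rules' => [{ 'alert' => 'ErrorRateHigh', 'expr' => 'rate(http_errors_total[5m]) > 0.1' }]
}

# What the bug produces: only the most recently updated project's group survives.
puts({ 'groups' => [app_two] }.to_yaml)

# What the file should contain: groups for every project in the cluster.
puts({ 'groups' => [app_one, app_two] }.to_yaml)
```
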
### Relevant logs and/or screenshots
Recording of the alerts file changing on a project I configured locally after modifying an alert on the second project, as described above. The projects in my case are called `the-group/minimal-ruby-app` and `the-group/minimal-ruby-app-2`.
### Results of GitLab environment info
Observed in GDK on recent master
### Possible fixes
The code responsible is ultimately in `Clusters::Applications::Prometheus#upgrade_command`, but the `#update_command` inherited from `ApplicationCore` is wrong in a similar way, as it overwrites `/etc/config/alerts` with a blank file.
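
One possible direction, sketched loosely below: whatever command regenerates the config would need to aggregate the alert rules of every project attached to the cluster, instead of serializing only the project whose alert was just saved. The helpers `projects_with_alerts` and `rule_groups_for` are hypothetical stand-ins, not actual GitLab APIs.

```ruby
require 'yaml'

# Loose sketch only, not GitLab code. `projects_with_alerts` and
# `rule_groups_for` are hypothetical stand-ins for whatever the real
# models and services expose.
def merged_alerts_file(cluster)
  groups = cluster.projects_with_alerts.flat_map { |project| rule_groups_for(project) }

  # Emit one document containing every project's rule groups, rather than
  # overwriting /etc/config/alerts with only the last project's rules
  # (or, in the inherited #update_command case, with a blank file).
  { 'groups' => groups }.to_yaml
end
```
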
Related: Alert persistence needs to be fixed in order for Prometheus chart updates to be feasible #26704 (closed)
