Improve GitLab Self-Monitoring by creating a default Project for Instance Administration - MVC
Problem to solve
GitLab has been adding the ability for administrators to see insights into the health of their GitLab instance. To surface this experience natively, similar to how they would interact with an application they deployed via GitLab, we will add a base project to every GitLab instance, created specifically for visualizing and configuring the monitoring of that instance. This will eventually extend to creating incidents when the GitLab instance is behaving incorrectly.
Sasha, Software Developer, https://design.gitlab.com/research/personas#persona-sasha
Devon, DevOps Engineer, https://design.gitlab.com/research/personas#persona-devon
Sidney, Systems Administrator, https://design.gitlab.com/research/personas#persona-sidney
We should pre-create this project by default, with no interaction from users. It should be created during new installations of GitLab and, using a background migration, for existing installations, with the following settings:
- Project name: "GitLab Instance Administration".
- Project visibility: internal by default.
- Membership by default includes the default root admin user.
- Prometheus integration is enabled and configured for the internal Prometheus server.
- Prometheus is configured to send alert webhooks to GitLab.
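The defaults listed above can be written down as a single data structure, which makes it easy to check that a created project carries every expected setting. This is only a sketch: the constant name, the key names, and the helper are hypothetical and are not taken from the GitLab codebase.

```ruby
# Hypothetical representation of the proposed defaults. Key names are
# illustrative only and do not correspond to real GitLab attributes.
DEFAULT_SELF_MONITORING_PROJECT = {
  name: 'GitLab Instance Administration',
  visibility: :internal,
  members: [:root_admin],
  prometheus_integration: { enabled: true, server: :internal },
  alert_webhook_target: :gitlab
}.freeze

# Returns the default keys that are absent from the given settings hash.
def missing_settings(settings)
  DEFAULT_SELF_MONITORING_PROJECT.keys - settings.keys
end
```

An empty result from missing_settings means every default in the list has a counterpart in the settings being checked.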
Steps to achieve the above proposal
MR to copy the prometheus['listen_address'] setting to gitlab.yml - omnibus-gitlab!3383 (merged), https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/30153, https://gitlab.com/gitlab-org/gitlab-ee/merge_requests/14511
We may also need an alertmanager['receivers'] setting (in omnibus's gitlab.rb file) to allow a webhook to be configured.
- MR for whitelisting of localhost servers so that the Prometheus server can be connected to the project even if "Outbound requests to localhost" are blocked. - https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/30350
Service to create the project. - https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/30153 and https://gitlab.com/gitlab-org/gitlab-ee/merge_requests/14511
- Create a project with readme.
- Connect the built-in Prometheus as a manually configured Prometheus service in the project.
- Add all admins as maintainers.
- Add the internal Prometheus address to the whitelist implemented in step 3. - https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/31148
- Add webhook config to Alertmanager to send alerts to the Rails app. (Did not make it into the MVC, will be worked on in https://gitlab.com/gitlab-org/gitlab-ce/issues/64706)
- Save project_id in ApplicationSettings so that it is easy to retrieve in the future. - https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/31111
- Change how admins are added as maintainers to the project: Create a group containing all admins and add them as maintainers to the project. - https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/30948
Add docs - https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/31530
- Should contain instructions to manually add an Alertmanager webhook.
- Project README should point to these docs. - https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/31389
- Migration to execute the above service - https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/31389
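The service steps above can be sketched in plain Ruby. This is not the code from the linked MRs: every class, attribute, and the group name are hypothetical in-memory stand-ins, and the sketch only assumes that the service should be safe to run more than once, since it is executed both on new installs and by a migration.

```ruby
require 'set'

# In-memory stand-ins for GitLab models. All names here are hypothetical.
Project = Struct.new(:id, :name, :visibility, :prometheus_url, :maintainer_group)
Group   = Struct.new(:name, :members)

# Stand-in for ApplicationSettings: stores the saved project id and the
# whitelist of local addresses that may be reached by outbound requests.
class InstanceSettings
  attr_accessor :instance_administration_project_id
  attr_reader :outbound_local_requests_whitelist

  def initialize
    @outbound_local_requests_whitelist = Set.new
  end
end

class CreateInstanceAdministrationProject
  PROJECT_NAME = 'GitLab Instance Administration'

  def initialize(settings:, admins:, prometheus_listen_address:, projects: {})
    @settings = settings
    @admins = admins
    @prometheus_listen_address = prometheus_listen_address
    @projects = projects # id => Project, stands in for the database
  end

  def execute
    # Idempotent: reuse the project whose id was saved on a previous run.
    existing = @projects[@settings.instance_administration_project_id]
    return existing if existing

    # Put all admins in one group and make that group the maintainers.
    group = Group.new('Instance Administrators', @admins.dup)
    project = Project.new(@projects.size + 1, PROJECT_NAME, :internal,
                          "http://#{@prometheus_listen_address}", group)
    @projects[project.id] = project

    # Whitelist the Prometheus host so it stays reachable even when
    # outbound requests to localhost are blocked.
    host = @prometheus_listen_address.split(':').first
    @settings.outbound_local_requests_whitelist << host

    # Save the id so the project is easy to retrieve later.
    @settings.instance_administration_project_id = project.id
    project
  end
end
```

Saving the project id in the settings object is what makes the second run a no-op, which matters when the migration runs on an instance where the project already exists.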
What does success look like, and how can we measure that?
GitLab is set up to monitor itself out of the box.
There will be a project named 'GitLab Instance Administration' with all admins as maintainers, configured to connect to the internal Prometheus that is installed by Omnibus. If an admin configures an Alertmanager webhook to send alerts to GitLab (instructions should be in the project's README), issues will be created for incoming alerts.
Note that the Monitoring dashboard will not show anything since there are no deployments to an environment.
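To illustrate the alert-to-issue flow described above, here is a minimal sketch that turns an Alertmanager webhook payload into issue attributes. The payload fields used (alerts, status, labels, annotations, startsAt) follow the Alertmanager webhook format; the function name and the shape of the resulting issue hash are assumptions, and this is not how GitLab itself processes alerts.

```ruby
require 'json'

# Builds one hash of issue attributes per firing alert in an Alertmanager
# webhook payload. The returned keys are an assumption for illustration.
def issues_from_alert_payload(json)
  payload = JSON.parse(json)

  payload.fetch('alerts', [])
         .select { |alert| alert['status'] == 'firing' } # skip resolved alerts
         .map do |alert|
    labels = alert.fetch('labels', {})
    annotations = alert.fetch('annotations', {})

    {
      title: labels.fetch('alertname', 'Unnamed alert'),
      description: annotations.fetch('description', ''),
      started_at: alert['startsAt']
    }
  end
end
```

Resolved alerts are skipped here so that only alerts that are still firing would open an issue; whether resolved alerts should close issues is left out of the sketch.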