Spike: Backend support for multiple external Prometheus instances
Spike issue based on the following customer request
It seems that the majority of the work should be handled by the backend (BE). This spike issue will help us better understand the effort required to allow our users to connect multiple Prometheus instances in GitLab.
@nagyv-gitlab fyi
Based on the weekly groupapm call from the 5th of August (https://youtu.be/pEv1olqEEUY?t=181), I understand that connecting and using multiple Prometheus instances will be a premium feature. This should be taken into account when researching possible implementations. // cc @dhershkovitch
Summary
There are 3 major areas of work required to deliver this feature:
1. Storage of multiple Prometheus API configurations
More details and the reasoning behind this approach can be found at: #229144 (comment 391166071)
A new relation, `PrometheusAPIConfig`, should be created. It should have at least the following columns:
| name | type |
|---|---|
| id | bigint |
| api_url | text |
| headers | jsonb |
| cluster_id | bigint |
| project_id | bigint |
It should take over the responsibility for storing connection configuration data, which is currently shared between the `PrometheusService` and `ApplicationsPrometheus` relations.
classDiagram
PrometheusAPIConfig --o PrometheusService
PrometheusAPIConfig -- ApplicationsPrometheus
class PrometheusAPIConfig {
# add
+ api_url
+ headers
priority
}
class PrometheusService{
# remove
- api_url
- headers
}
class ApplicationsPrometheus{
# remove
- api_url
- headers
}
Later on, the metrics dashboard viewer can choose between records of this relation in order to browse data from different Prometheus instances.
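The record shape described above can be sketched in plain Ruby. This is only an illustration, assuming that each configuration belongs to exactly one owner, either a cluster or a project; the table lists both columns, but the issue does not state that constraint. `ApiConfigSketch` and its `valid?` method are hypothetical names and not existing GitLab code.

```ruby
# Hypothetical in-memory stand-in for the proposed relation. The real
# implementation would be an ActiveRecord model backed by the table above.
ApiConfigSketch = Struct.new(:id, :api_url, :headers, :cluster_id, :project_id, keyword_init: true) do
  # Assumption: a configuration needs a URL and exactly one owner
  # (cluster_id or project_id, never both, never neither).
  def valid?
    return false if api_url.to_s.empty?

    [cluster_id, project_id].compact.size == 1
  end
end

manual = ApiConfigSketch.new(id: 1, api_url: 'https://prometheus.example.com', headers: {}, project_id: 42)
orphan = ApiConfigSketch.new(id: 2, api_url: 'https://prometheus.example.com', headers: {})

puts manual.valid? # true
puts orphan.valid? # false
```

Whether the "exactly one owner" rule should be enforced in the model, in the database, or at all is an open design question for the implementation issue.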
Plan for migrating Prometheus configuration to the new relation
Each point on the following list can be turned into at least one independent issue.
1. Create the `Metrics::PrometheusApiConfiguration` relation #235722 (closed)
2. Create policies to define access rights to the new relation based on the user's role in the project
3. Add create, update, and delete (CUD) services to manage `Metrics::PrometheusApiConfiguration` entries
4. Duplicate CUD actions made on the `PrometheusService` relation, so that they are also applied to the corresponding `Metrics::PrometheusApiConfiguration` entries
5. Run a background migration that copies records added to the `PrometheusService` relation prior to step 4 into `Metrics::PrometheusApiConfiguration`
6. Change `Metrics::PrometheusApiConfiguration` to be the source of configuration data (after this step it may be good to leave at least one milestone as a probation period)
7. Stop applying new changes to the `PrometheusService` relation
8. Remove the restriction of one manual Prometheus configuration per project for EE users
9. Remove Prometheus API configuration related code from the `PrometheusService` model
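The dual-write and backfill steps of the plan above (steps 4 and 5) can be sketched as follows. This is a minimal in-memory illustration, not the proposed implementation: `DualWriteSketch` and its methods are hypothetical, and hashes keyed by project id stand in for the `PrometheusService` and `Metrics::PrometheusApiConfiguration` relations.

```ruby
# Hypothetical in-memory model of the migration window.
class DualWriteSketch
  attr_reader :legacy, :configs

  def initialize
    @legacy = {}  # stands in for the legacy relation
    @configs = {} # stands in for the new relation
  end

  # Step 4: every create/update on the legacy record is mirrored to the new one.
  def save(project_id, api_url)
    @legacy[project_id] = { api_url: api_url }
    @configs[project_id] = { api_url: api_url }
  end

  # Step 4: deletions are mirrored as well.
  def delete(project_id)
    @legacy.delete(project_id)
    @configs.delete(project_id)
  end

  # Step 5: copy records that existed before dual writes started.
  # Entries already present in the new store are left untouched, so data
  # written through the dual-write path is never overwritten by older data.
  def backfill
    @legacy.each do |project_id, config|
      @configs[project_id] ||= config.dup
    end
  end
end

sketch = DualWriteSketch.new
sketch.legacy[1] = { api_url: 'https://old.example.com' } # written before step 4
sketch.save(2, 'https://new.example.com')                 # written after step 4
sketch.backfill

puts sketch.configs.keys.sort.inspect # [1, 2]
```

Running the backfill only after dual writes are in place is what makes the copy safe to repeat: a second run finds every key already present and changes nothing.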
2. Applying selected API configuration
Allowing the user to select the desired API configuration can be achieved with the following steps:
1. Pass the `datasource_id` parameter to the `Environment#metrics` action, then include it in the `dashboard-endpoint` returned by `EnvironmentsHelper#project_and_environment_metrics_data`
2. Pass the `datasource_id` parameter through `MetricsDashboard#metrics_dashboard` into `::Gitlab::Metrics::Dashboard::Finder` and then into `::Gitlab::Metrics::Dashboard::Stages::MetricEndpointInserter`, so it can be added to each metric's `prometheus_endpoint_path`
3. Handle the `datasource_id` parameter at `Metrics::Dashboard::PrometheusApiProxy`, so the correct entry (either environment or cluster) is selected and handed over to `Prometheus::ProxyService`
4. Add a reference to `datasource` (for the MVC it can be either `PrometheusService` or `ApplicationsPrometheus`; after that it should only correspond to records from `Metrics::PrometheusApiConfiguration`) to the `PrometheusAlert` relation
5. Handle the `datasource_id` parameter at `Projects::Prometheus::AlertsController#create`, so the correct cluster entry is selected as the target of the `schedule_prometheus_update!` method
6. Handle the `datasource_id` parameter at `Projects::Prometheus::AlertsController#index` and `Projects::Prometheus::AlertsController#show`, so the alerts displayed on the metrics dashboard correspond to the selected Prometheus instance
Here `datasource_id` carries the GlobalId of the entry storing the Prometheus API configuration (for the MVC it can be either `PrometheusService` or `ApplicationsPrometheus`; after that it should only correspond to records from `Metrics::PrometheusApiConfiguration`).
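Because `datasource_id` arrives from the client, the receiving code would need to check that the GlobalId points at a permitted relation before loading anything. The sketch below shows one way to do that in plain Ruby, without the `globalid` gem. `parse_datasource_id`, `ALLOWED_DATASOURCES`, and `GID_PATTERN` are hypothetical names, and the model names are copied from the text above, so they may not match the fully qualified class names in the codebase.

```ruby
# Hypothetical allow-list for the MVC, using the names from this issue.
ALLOWED_DATASOURCES = %w[PrometheusService ApplicationsPrometheus].freeze

# Matches strings shaped like gid://<app>/<Model::Name>/<numeric id>.
GID_PATTERN = %r{\Agid://(?<app>[^/]+)/(?<model>[A-Za-z0-9_:]+)/(?<id>\d+)\z}

# Returns [model_name, id] for a permitted GlobalId string, or nil otherwise.
def parse_datasource_id(datasource_id)
  match = GID_PATTERN.match(datasource_id.to_s)
  return nil unless match
  return nil unless ALLOWED_DATASOURCES.include?(match[:model])

  [match[:model], match[:id].to_i]
end

p parse_datasource_id('gid://gitlab/PrometheusService/12') # ["PrometheusService", 12]
p parse_datasource_id('gid://gitlab/User/1')               # nil
```

Once the migration is complete, the allow-list would shrink to the single new relation, which is the simplification the parenthetical above describes.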
Steps 1 and 2 can be delivered separately, while steps 3 to 6 should be bundled together to avoid inconsistent behavior.
Issue: #235807 (closed). POC: !39352 (closed). Demo: https://www.youtube.com/watch?v=3ZFSztgGVxw&feature=youtu.be
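To illustrate step 2 of this area, the following sketch appends `datasource_id` to a metric's endpoint path while keeping any query parameters that are already present. It is not taken from the POC: `with_datasource` is a hypothetical helper, and the example path is made up for illustration.

```ruby
require 'uri'

# Appends datasource_id to an endpoint path, preserving existing query
# parameters. Returns the path unchanged when no datasource is selected.
def with_datasource(path, datasource_id)
  return path if datasource_id.nil?

  uri = URI.parse(path)
  params = URI.decode_www_form(uri.query.to_s)
  params << ['datasource_id', datasource_id]
  uri.query = URI.encode_www_form(params)
  uri.to_s
end

# Hypothetical endpoint path, for illustration only.
puts with_datasource('/example/prometheus/api/v1/query?query=up', 'gid://gitlab/PrometheusService/12')
```

Leaving the path untouched when `datasource_id` is absent keeps existing dashboards working, which is what allows steps 1 and 2 to ship before the proxy and alert handling.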
3. Exposing available API configuration
A new endpoint is needed that exposes all available API configurations. For the MVC it can use GlobalId to accommodate the shared responsibility for storing Prometheus API configurations in both the `PrometheusService` and `ApplicationsPrometheus` relations. Later on it can either keep using the GlobalId of the new `PrometheusApiConfiguration` relation, or be simplified to just the record id from the database. MVC issue: #235809 (closed)
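A rough sketch of the payload such an endpoint could return is shown below, assuming records from both relations are collected into one list and identified by GlobalId-style strings. `Datasource` and `datasources_payload` are hypothetical names, and the response fields are an assumption; the issue does not specify the response format. Connection details such as `headers` are left out of the payload on the assumption that they may contain credentials.

```ruby
require 'json'

# Hypothetical record type standing in for rows from either relation.
Datasource = Struct.new(:model_name, :id)

# Builds one entry per configuration. The GlobalId-style id lets records
# from different relations share a single list without id collisions.
def datasources_payload(records)
  records.map do |record|
    { id: "gid://gitlab/#{record.model_name}/#{record.id}", source: record.model_name }
  end
end

records = [
  Datasource.new('PrometheusService', 1),
  Datasource.new('ApplicationsPrometheus', 1)
]

puts JSON.pretty_generate(datasources_payload(records))
```

Note that both example records have database id 1, which is exactly the collision the GlobalId avoids during the MVC phase.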
Iterations
Area of work 1 can be delivered independently of areas 2 and 3. It will take more time, as it requires running a data migration and making sure that no data is lost after the changes. The listed steps up to the 5th can be delivered within a single milestone, but don't have to be, depending on the priority this feature will have. After that I recommend pausing for one milestone to make sure there are no critical bugs that could cause data loss; the rest can follow afterwards. Important note: even with area of work 1 fully implemented, users will not gain any benefit until areas 2 and 3 are done.
I suggest starting with the 2nd area of work, which can be tested right away without any other area being implemented yet. Then follow it up with the 3rd area of work. Having both in place will allow users to switch between external Prometheus instances and those coming from GitLab managed applications.