
Add support for Prometheus as HTTP alert integrations

Sarah Yasonik requested to merge sy-setup-http-integrations-for-prometheus into master

What does this MR do and why?

Related issue: Allow Prometheus' metrics dashboard and incomin... (#338838)

Summary

  • This MR is preparation for migrating Integrations::Prometheus records to the table for AlertManagement::HttpIntegration.
  • There are no user-facing changes.
  • This enables us to separate alerting from metrics, so we can remove the Metrics Dashboard feature in %16.0.

Changes in this MR

Add type_identifier attribute to HTTP integration table/model
  • Add column type_identifier to alert_management_http_integrations table
  • Add type_identifier enum to AlertManagement::HttpIntegration model, with values for only :http and :prometheus
    • Notable: For backwards compatibility, the Integrations::Prometheus records will be migrated with a legacy identifier, so we can still accept alerts to the same endpoints, even though the URL doesn't match the default format for new integrations.
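
As a rough illustration of the new attribute, here is a minimal plain-Ruby sketch of a two-value type enum. It is not the code from this MR: the class name `HttpIntegrationSketch`, the integer values, and the helper methods are assumptions made for the example. The real model is an ActiveRecord class and would normally rely on the Rails `enum` macro instead.

```ruby
# Plain-Ruby stand-in for AlertManagement::HttpIntegration (hypothetical name).
# The integer values are assumptions, not taken from the MR diff.
class HttpIntegrationSketch
  TYPE_IDENTIFIERS = { http: 0, prometheus: 1 }.freeze

  attr_reader :type_identifier

  # Defaults to :http, mirroring the default described for CreateService.
  def initialize(type_identifier: :http)
    self.type_identifier = type_identifier
  end

  # Accepts a symbol or string and rejects anything outside the two known types.
  def type_identifier=(value)
    key = value.to_sym
    unless TYPE_IDENTIFIERS.key?(key)
      raise ArgumentError, "'#{value}' is not a valid type_identifier"
    end

    @type_identifier = key
  end

  # Value that would be stored in the type_identifier column.
  def type_identifier_value
    TYPE_IDENTIFIERS.fetch(type_identifier)
  end

  def http?
    type_identifier == :http
  end

  def prometheus?
    type_identifier == :prometheus
  end
end
```

Restricting the set to two values keeps any other integration type from being persisted by accident.
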
Modify CRUD services to accommodate new attribute
  • Add support to CreateService for the new attribute, defaulting to :http when type_identifier is absent
    • Notable: Multiple generic HTTP integrations are a GitLab Premium feature, but Prometheus integrations were limited to 1 per project on every tier. That was an artifact of piggy-backing on Integrations::Prometheus. This MR changes the limit to 1 HTTP integration per type for GitLab Free. So the behavior is exactly the same for GitLab Free, but expanded for GitLab Premium.
  • Add support to UpdateService to change the type of integrations (accounting for licensed feature availability)
  • Add support to DestroyService to remove prometheus type integrations
    • Notable:
      • Integrations::Prometheus records can't be deleted once they've been persisted. And we plan to create a corresponding record in alert_management_http_integrations for each Integrations::Prometheus ever configured to accept alerts.
      • If a corresponding http_integration does not exist, it means either:
          1. the migration hasn't been completed yet, or
          2. the integration was deleted.
      • To rule out the second case (a deleted integration), I opted to prevent deletion of the legacy prometheus integrations from the HTTP integrations table. This lets us use presence checks to determine which record should act as the single source of truth (SSOT) for that integration. That way, we neither switch over before the migration has run nor have to keep state in sync between the two records.
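
The create and destroy rules described above can be sketched in plain Ruby. This is a hedged, in-memory illustration, not the services from this MR: `ProjectSketch`, `Integration`, the `legacy` flag, and the `multiple_integrations_licensed` parameter are all hypothetical names introduced for the example.

```ruby
# Hypothetical in-memory model; none of these names come from the MR diff.
Integration = Struct.new(:type_identifier, :legacy, keyword_init: true)

class ProjectSketch
  attr_reader :integrations

  # multiple_integrations_licensed stands in for the GitLab Premium feature check.
  def initialize(multiple_integrations_licensed:)
    @multiple_integrations_licensed = multiple_integrations_licensed
    @integrations = []
  end

  # Create rule: the type defaults to :http when absent. Without the licensed
  # feature, at most one integration per type is allowed.
  # Returns the new integration, or nil when the limit is reached.
  def create_integration(type_identifier: nil, legacy: false)
    type = (type_identifier || :http).to_sym
    return nil if limit_reached?(type)

    integration = Integration.new(type_identifier: type, legacy: legacy)
    integrations << integration
    integration
  end

  # Destroy rule: legacy prometheus integrations cannot be deleted, so a
  # missing record can only mean "not migrated yet".
  # Returns true when the integration was removed, false otherwise.
  def destroy_integration(integration)
    return false if integration.type_identifier == :prometheus && integration.legacy

    # Compare by object identity; Struct#== would match equal-valued siblings.
    before = integrations.size
    integrations.reject! { |candidate| candidate.equal?(integration) }
    integrations.size < before
  end

  private

  def limit_reached?(type)
    return false if @multiple_integrations_licensed

    integrations.any? { |existing| existing.type_identifier == type }
  end
end
```

In this sketch, a project without the licensed feature ends up with at most two integrations (one `:http`, one `:prometheus`), which matches the maximum of 2 mentioned for GitLab Free.
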

Why no new index?

  1. There are only ~1500 alert_management_http_integrations records on GitLab.com, so the table is small.
  2. Searching for an HTTP integration is always scoped by project. Queries beyond that mix and match filters on active, endpoint_identifier, type_identifier, and/or id. GitLab Free projects will also have at most 2 alert_management_http_integrations, so they definitely won't benefit from a new index.
  3. The most common query (alert ingestion) is already covered by index_http_integrations_on_active_and_project_and_endpoint. The other queries should be very infrequent, as this configuration is highly unlikely to change often. Some of them will be removed with #409734, so those queries won't exist for long.

Why no feature flag?

I started development with a feature flag, but the resulting diff was very convoluted. For this change, I think it's safer to have a clean diff, clear expectations of behavior, and backwards compatibility without the flag.

What about all the other stuff this needs?

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Sarah Yasonik
