Skip to content

Add service to bulk process alerts

Sarah Yasonik requested to merge sy-add-bulk-process-alerts-service into master

What does this MR do and why?

Context:

When a project has an alert integration configured, prometheus servers can be configured to send alerts to GitLab. When we receive a prometheus request, it may contain many unique alerts within the same payload. Currently, we handle each alert individually. However, there's no strict limit on the Prometheus side to the number of alerts a notification can contain. So we've got a pretty hefty N+1.

Related issue: https://gitlab.com/gitlab-org/gitlab/-/issues/348676

Changes in this MR:
  • Adds a service to process alerts in bulk (which is not yet utilized)
  • Adds new/replacement specs for the alert processing flow
    • The new specs for BulkProcessAlertsService will eventually replace the examples in spec/support/shared_examples/services/alert_management/alert_processing/
Scope:
  • BulkProcessAlertsService will be utilized in !95834 (closed) & !95854 (closed)
  • Removal/cleanup of the old classes/modules/specs will happen in a future MR
Motivation:
  • By adding a new service instead of reworking the existing AlertProcessing module, we can fully separate the effects of alert processing from auth and from payload interpretation. Isolation of these responsibilities makes it easier to test the code and makes it clearer where tests should be
  • The new specs group expectations into significantly fewer examples. This is to
    • create fewer records
    • run the processing logic fewer times
    • make the expected behavior more explicit

Holistic overview

Related MRs

All MRs will merge into master, but should merge in the order below

flowchart RL
    master
    95827["1. Add service (!95827)"]
    95834["2.a. Use service for generic alerts (!95834)"]
    95854["2.b. Use service for prometheus alerts (!95854)"]
    TBD["3. Delete unneeded tests/files (MR TBD)"]

    95827-->master
    95834-->95827
    95854-->95827
    TBD-->95834
    TBD-->95854
Expected data flow for alert processing
  1. Alert comes into integration endpoint
  2. Controller determines which NotifyService to use
  3. NotifyService checks permissions & validity
  4. NotifyService parses input into Gitlab::AlertManagement::Payload` format
  5. NotifyService passes payloads to BulkProcessAlertsService
  6. BulkProcessAlertsService is responsible for finding any existing alerts for the payloads and triggering any side-effects
  7. For each alert, BulkProcessAlertsService delegates to UpdateAlertFromPayloadService to modify the alert itself.
    • As a note: This is still an N+1, but it will stay for now, since we return the id of each alert in the HTTP response. The expected system notes & record updates will vary by alert, so we still need to run validations on save.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Sarah Yasonik

Merge request reports