[SPIKE] Investigate performance of ClickHouse for Vulnerability Metric Aggregation and Querying

As a follow-up to Spike: Investigate options for searching vulner... (#352665, closed) and [SPIKE] Investigate ingestion of significant vu... (#397015, closed), we need to investigate the performance and engineering complexity involved in metric aggregation against a significant quantity of vulnerability data.

Some projects within GitLab have hundreds of thousands of vulnerabilities associated with them. We need to roughly establish our ClickHouse querying strategies to determine what we can do synchronously without crippling the server, and what will need to be queried asynchronously and cached to ensure a desirable user experience.

Expected Outcomes

  1. Determine how we will execute queries against ClickHouse (REST API?), and whether there is existing code, gems, or queries to work from.
  2. Determine what kind of performance we can reasonably expect when querying a standard project vulnerability set (~10,000 vulnerabilities).
  3. What kind of performance can we expect when performing metric aggregation queries? For example: time spent in each state (detected, confirmed, resolved), average time to resolve, and counts by fields other than severity (currently the only counts in the security dashboard). A rough sketch of such a query is included after this list.
  4. What will the impact be on our self-managed users?
    1. Roughly, what sort of ClickHouse cluster capabilities will our self-managed users need to use this effectively?
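
To ground the aggregation question, here is a minimal sketch of the shape of query we would be benchmarking, run over ClickHouse's HTTP interface from Python. The `vulnerabilities` table, its columns, and the credentials are placeholders for illustration only, not an agreed schema, and `requests` simply stands in for whatever client library or gem we settle on.

```python
# Hypothetical sketch: run a metric-aggregation query against a ClickHouse node
# over its HTTP interface (default port 8123). Table name, columns, and
# credentials are illustrative placeholders, not an agreed schema.
import requests

CLICKHOUSE_URL = "http://localhost:8123/"  # default HTTP interface endpoint
AUTH = ("default", "")                     # placeholder credentials

# Example aggregation: vulnerability counts per state plus average time to
# resolve, scoped to a single project.
QUERY = """
SELECT
    state,
    count() AS total,
    avgIf(dateDiff('day', detected_at, resolved_at),
          resolved_at IS NOT NULL) AS avg_days_to_resolve
FROM vulnerabilities
WHERE project_id = {project_id:UInt64}
GROUP BY state
FORMAT JSON
"""

def project_state_metrics(project_id: int) -> list:
    # The project id is bound server-side via ClickHouse query parameters
    # (param_<name> in the URL, {name:Type} in the SQL), avoiding string
    # interpolation of user input into the query.
    response = requests.post(
        CLICKHOUSE_URL,
        params={"param_project_id": project_id},
        data=QUERY,
        auth=AUTH,
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["data"]

if __name__ == "__main__":
    for row in project_state_metrics(1):  # placeholder project id
        print(row)
```

Benchmarking this shape of query at ~10,000 rows and again at several hundred thousand rows per project should give us the data to draw the line between what can run synchronously and what needs to be queried asynchronously and cached.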

Timebox: 4 Days