Follow-up from "Add API endpoint for vulnerability history at group level"

Everyone can contribute. Help move this issue forward while earning points, leveling up and collecting rewards.

  • Close this issue

The following discussion from !8603 (merged) should be addressed:

  • @abrandl started a discussion: (+3 comments)

    @gonzoyumo I'm not exactly sure about this: Is there some sort of redundancy in the query?

    EXPLAIN ANALYZE
    SELECT
        CAST(vulnerability_occurrence_pipelines.created_at AS DATE) AS DAY,
        "vulnerability_occurrences"."severity",
        count(DISTINCT vulnerability_occurrences.id)
    FROM
        "vulnerability_occurrences"
        INNER JOIN "vulnerability_occurrence_pipelines" ON "vulnerability_occurrence_pipelines"."occurrence_id" = "vulnerability_occurrences"."id"
    WHERE
        "vulnerability_occurrence_pipelines"."pipeline_id" IN (
            SELECT
                "ci_pipelines"."id"
            FROM
                "ci_pipelines"
            WHERE
                "ci_pipelines"."project_id" IN (
                    SELECT
                        "projects"."id"
                    FROM
                        "projects"
                        INNER JOIN routes rs ON rs.source_id = projects.id
                            AND rs.source_type = 'Project'
                    WHERE (rs.path LIKE 'olivier/%'))
                AND (EXISTS (
                        SELECT
                            1
                        FROM
                            "vulnerability_occurrence_pipelines"
                        WHERE (ci_pipelines.id = vulnerability_occurrence_pipelines.pipeline_id)))
                AND "ci_pipelines"."status" = 'success')
            AND (vulnerability_occurrence_pipelines.created_at >= '2018-08-27')
        GROUP BY
            "day",
            "vulnerability_occurrences"."severity"
        ORDER BY
            DAY;

    I was wondering if this part is redundant to the JOIN condition:

                AND (EXISTS (
                        SELECT
                            1
                        FROM
                            "vulnerability_occurrence_pipelines"
                        WHERE (ci_pipelines.id = vulnerability_occurrence_pipelines.pipeline_id)))

    I don't know the model too well though, feel free to ignore. :) If this can be removed, it speeds up the query quite a lot and the plan gets a tad nicer.

    I shortened the query to make this more readable to the relevant part below (the EXISTS(...) is redundant IMHO).

    SELECT
    *
    FROM
        "vulnerability_occurrence_pipelines"
    WHERE
        "vulnerability_occurrence_pipelines"."pipeline_id" IN (
            SELECT
                "ci_pipelines"."id"
            FROM
                "ci_pipelines"
            WHERE
                "ci_pipelines"."project_id" IN (...)
                AND (EXISTS (
                        SELECT
                            1
                        FROM
                            "vulnerability_occurrence_pipelines"
                        WHERE (ci_pipelines.id = vulnerability_occurrence_pipelines.pipeline_id)))
            )
Edited Jun 26, 2025 by 🤖 GitLab Bot 🤖
Assignee Loading
Time tracking Loading