Implement cronjob to adjust vulnerability statistics everyday (!36040) · Merge requests · GitLab.org / GitLab

Mehmet Emin INAC requested to merge periodically_update_the_vulnerability_statistics into master Jul 06, 2020

What does this MR do?

This MR introduces a new cronjob which runs on every midnight to adjust the vulnerability statistics for all projects which have vulnerabilities.

Since calculating the vulnerability statistics for all projects can take quite long, we are scheduling new jobs for each 1000 projects and calculating the vulnerability statistics in those jobs.

The reason for this approach is;

In case if the Sidekiq process crushes(segfault etc.), calculation will fail and the statistics will not be updated
In case if the Sidekiq process receives SIGINT or SIGTERM, it will wait for preconfigured duration to let the jobs finish for graceful shutdown but in case if there are some jobs still running, it just stops the thread without pushing the job back to the queue on Redis. Which means, if we have long running jobs, this can cause problems.
In case if the Sidekiq process receives SIGKILL, there is no way the Sidekiq process can push the job back so the update will be missing for all the remaining projects.

Doing the calculations in separate jobs will mitigate or solve the issues mentioned above.

Note: In the future, we will also use this service class to create historical statistics for the previous day so the endpoint which returns vulnerability statistics will just read records instead of doing heavy calculations with intensive joins on database layer.

For Database Review

The query for `gitlab-org/gitlab`

EXPLAIN ANALYZE INSERT INTO vulnerability_statistics (project_id, total, info, unknown, low, medium, high, critical, letter_grade, created_at, updated_at) (
    SELECT
        severity_counts.*,
        (
            CASE WHEN severity_counts.critical > 0 THEN
                4
            WHEN severity_counts.high > 0
                OR severity_counts.unknown > 0 THEN
                3
            WHEN severity_counts.medium > 0 THEN
                2
            WHEN severity_counts.low > 0 THEN
                1
            ELSE
                0
            END) AS letter_grade,
        now() AS created_at,
        now() AS updated_at
    FROM (
        SELECT
            vulnerabilities.project_id AS project_id,
            COUNT(*) AS total,
            COUNT(*) FILTER (WHERE severity = 1
                AND state IN (1, 4)) AS info,
            COUNT(*) FILTER (WHERE severity = 2
                AND state IN (1, 4)) AS unknown,
            COUNT(*) FILTER (WHERE severity = 4
                AND state IN (1, 4)) AS low,
            COUNT(*) FILTER (WHERE severity = 5
                AND state IN (1, 4)) AS medium,
            COUNT(*) FILTER (WHERE severity = 6
                AND state IN (1, 4)) AS high,
            COUNT(*) FILTER (WHERE severity = 7
                AND state IN (1, 4)) AS critical
        FROM
            vulnerabilities
        WHERE
            vulnerabilities.project_id = 278964
        GROUP BY
            vulnerabilities.project_id) AS severity_counts)
ON CONFLICT (project_id)
    DO UPDATE SET
        total = EXCLUDED.total,
        info = EXCLUDED.info,
        unknown = EXCLUDED.unknown,
        low = EXCLUDED.low,
        medium = EXCLUDED.medium,
        high = EXCLUDED.high,
        critical = EXCLUDED.critical,
        letter_grade = EXCLUDED.letter_grade,
        updated_at = EXCLUDED.updated_at

The query plan for on database labs;

1st run(internal only link);

Time: 1.693 s
  - planning: 0.319 ms
  - execution: 1.693 s
    - I/O read: 1.417 s
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 692 (~5.40 MiB) from the buffer pool
  - reads: 987 (~7.70 MiB) from the OS file cache, including disk I/O
  - dirtied: 56 (~448.00 KiB)
  - writes: 0

2nd run(internal only link);

Time: 5.825 ms
  - planning: 0.584 ms
  - execution: 5.241 ms
    - I/O read: 0.000 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 1649 (~12.90 MiB) from the buffer pool
  - reads: 0 from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

[-] Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process.
[-] Tested in all supported browsers
[-] Informed Infrastructure department of a default or new setting change, if applicable per definition of done

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

[-] Label as security and @ mention @gitlab-com/gl-security/appsec
[-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
[-] Security reports checked/validated by a reviewer from the AppSec team

Edited Jul 08, 2020 by Mehmet Emin INAC

Implement cronjob to adjust vulnerability statistics everyday