
Create migration to schedule re-population of historical stats

What does this MR do?

This MR introduces a post-deployment migration to schedule re-population of the historical vulnerability statistics.

Related to #244380 (closed)

Database migration

Post-deployment migration

The following migration schedules the background jobs:

rake db:migrate:up
== 20201119092319 ScheduleRepopulateHistoricalVulnerabilityStatistics: migrating
== 20201119092319 ScheduleRepopulateHistoricalVulnerabilityStatistics: migrated (0.0622s)
rake db:migrate:down
== 20201119092319 ScheduleRepopulateHistoricalVulnerabilityStatistics: reverting
== 20201119092319 ScheduleRepopulateHistoricalVulnerabilityStatistics: reverted (0.0000s)

Query & timing analysis

Post-deployment migration

The post-deployment migration schedules background jobs for all the projects that have vulnerabilities, using the following query:

SELECT
    "project_settings"."project_id"
FROM
    "project_settings"
WHERE (has_vulnerabilities IS TRUE)
    AND "project_settings"."project_id" >= 1
ORDER BY
    "project_settings"."project_id" ASC
LIMIT $1 OFFSET $2

which produces the following execution plan (https://explain.depesz.com/s/4rzh):

Limit  (cost=0.31..1.64 rows=50 width=4) (actual time=0.240..3.403 rows=50 loops=1)
   Buffers: shared hit=4 read=6 dirtied=2
   I/O Timings: read=3.336
   ->  Index Only Scan using index_project_settings_on_project_id_partially on public.project_settings  (cost=0.29..39.88 rows=1491 width=4) (actual time=0.228..3.394 rows=51 loops=1)
         Index Cond: (project_settings.project_id >= 1)
         Heap Fetches: 2
         Buffers: shared hit=4 read=6 dirtied=2
         I/O Timings: read=3.336

As of writing, there are 10,590 projects to be migrated, which makes around ~212 iterations. Each iteration should take around 30ms (25ms query time + 5ms scheduling time), for a total of ~7s.
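The iteration math above can be sanity-checked with a short Ruby snippet. All figures (project count, batch size, per-iteration time) are the MR's own estimates, not measured here.

```ruby
# Rough check of the migration loop cost, using the numbers quoted above.
PROJECT_COUNT    = 10_590 # projects with vulnerabilities, per the MR
BATCH_SIZE       = 50     # projects scheduled per background job
PER_ITERATION_MS = 30     # ~25ms query + ~5ms scheduling

iterations    = (PROJECT_COUNT / BATCH_SIZE.to_f).ceil
total_seconds = iterations * PER_ITERATION_MS / 1000.0

puts iterations    # => 212
puts total_seconds # => 6.36 (quoted as ~7s to allow for overhead)
```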

Background jobs

As mentioned in the section above, there will be ~212 background jobs, each of which populates the historical vulnerability statistics of 50 projects for the past 365 days. The background job has already been implemented and used twice in the past (!40766 (merged)), so the timing analysis is based on those previous runs.

In total, ~212 jobs will be executed with a 5-minute delay between each, which amounts to ~18h until all the jobs finish.
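The ~18h figure follows directly from the job count and spacing; a minimal sketch of the schedule, assuming the MR's 212 jobs at 5-minute intervals:

```ruby
# Each job is enqueued with an increasing delay of 5 minutes.
DELAY_SECONDS = 5 * 60
JOB_COUNT     = 212

# Offset (in seconds) at which each job starts relative to the migration.
schedule    = (1..JOB_COUNT).map { |i| i * DELAY_SECONDS }
total_hours = schedule.last / 3600.0

puts total_hours.round(1) # => 17.7 (quoted as ~18h)
```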


The 5-minute delay between jobs was chosen based on the following estimates:

  1. Fetching the statistics for a single project

The query to fetch the statistics for a single project over a 10-day window is as follows:

SELECT DATE(calendar.entry) AS day,
  severity,
  COUNT(*)
FROM generate_series(
    DATE '2020-07-20',
    DATE '2020-07-29',
    INTERVAL '1 day'
  ) as calendar(entry)
  INNER JOIN vulnerabilities ON vulnerabilities.created_at <= calendar.entry
WHERE "vulnerabilities"."project_id" = 278964
  AND (
    (
      vulnerabilities.dismissed_at IS NULL
      OR vulnerabilities.dismissed_at > calendar.entry
    )
    AND (
      vulnerabilities.resolved_at IS NULL
      OR vulnerabilities.resolved_at > calendar.entry
    )
  )
GROUP BY "day",
  "severity";

https://explain.dalibo.com/plan/FLW (~55ms)

  2. Inserting the statistics for a single project in one query

The following query inserts the statistics for a single project for the past 365 days:

INSERT INTO "vulnerability_historical_statistics" (
    "letter_grade",
    "created_at",
    "updated_at",
    "project_id",
    "total",
    "critical",
    "high",
    "medium",
    "low",
    "unknown",
    "info",
    "date"
  )
VALUES (
    1,
    '2020-07-30 07:23:26.813637',
    '2020-07-30 07:23:26.813637',
    10,
    1,
    0,
    0,
    0,
    1,
    0,
    0,
    '2020-05-01'
  ),
  ...,
  (
    1,
    '2020-07-30 07:23:26.813637',
    '2020-07-30 07:23:26.813637',
    10,
    91,
    0,
    0,
    0,
    91,
    0,
    0,
    '2020-07-30'
  ) ON CONFLICT DO NOTHING
RETURNING "id"

https://explain.depesz.com/s/Eael (~350ms)

Based on these numbers, the total runtime per job can be calculated with the following formula:

ETA = X1 * 50

  • X1 = Time spent on one project = X2 + X3
  • X2 = Time to fetch all statistics for a single project = X21 * 38
    • X21 = Time to fetch statistics for a single project for 10 days window = 55ms
  • X3 = Time to insert all statistics for a single project = 350ms

ETA = ((55ms * 38) + 350ms) * 50 = 122s
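The same arithmetic, spelled out in Ruby with the quoted timings (55ms per 10-day window, 38 windows per the MR's estimate, 350ms per insert):

```ruby
X21 = 55       # ms: fetch one 10-day window of statistics
X2  = X21 * 38 # ms: fetch all windows for one project (38 windows per the MR)
X3  = 350      # ms: insert a year's statistics for one project in one query
X1  = X2 + X3  # ms: total time spent on one project

eta_ms = X1 * 50 # 50 projects per background job

puts eta_ms / 1000.0 # => 122.0 seconds per job
```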

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • [-] Label as security and @ mention @gitlab-com/gl-security/appsec
  • [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • [-] Security reports checked/validated by a reviewer from the AppSec team
Edited by Mehmet Emin INAC
