Track Secure scans

Cameron Swords requested to merge track_secure_scans into master Jun 22, 2021

What does this MR do?

Uses a Snowplow custom event to track every time a Secure scan is run in a CI job.

This partially resolves issue #329157 (closed). The MR to add the Snowplow context schema can be found at https://gitlab.com/gitlab-org/iglu/-/merge_requests/58.

A new worker has been created to isolate the tracking of a scan, as far as possible, from the way scans are managed, deduplicated, and saved to the database. This should keep tracking stable as other parts of the code base change. In particular, the MergeReportService must not be applied to the report before tracking: we need to track exactly what is received in the JSON report, not the result of Rails modifying or normalizing it.

An idempotency_key has been added so that analytics queries can remove duplicate events (discussed in Slack).
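
As a rough illustration of how such a key lets a consumer drop duplicates, here is a minimal Python sketch. It is not the code in this MR: the event shape, the `scan_type` field, and the `dedupe_events` helper are assumptions made for the example, and how the key is actually computed is not shown here.

```python
def dedupe_events(events):
    """Keep only the first event seen for each idempotency_key.

    `events` is assumed to be an iterable of dicts that each carry an
    "idempotency_key" entry (hypothetical shape, for illustration only).
    """
    seen = set()
    unique = []
    for event in events:
        key = event["idempotency_key"]
        if key in seen:
            # Same key seen before: treat it as a duplicate delivery.
            continue
        seen.add(key)
        unique.append(event)
    return unique


# Hypothetical events: the first two share a key, as a retried job might.
events = [
    {"idempotency_key": "a1", "scan_type": "sast"},
    {"idempotency_key": "a1", "scan_type": "sast"},
    {"idempotency_key": "b2", "scan_type": "dast"},
]
unique_events = dedupe_events(events)
```

In an analytics warehouse the same idea would more likely be expressed as a query that keeps one row per key; the sketch only shows the principle.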

Calculating workload changes to queues

This is a new queue, so I've modelled the numbers on the security_scans::store_security_reports queue. That queue also parses Secure JSON reports, but it runs at the end of the pipeline (rather than at the end of a build) and parses every JSON report in the pipeline (rather than only those in the build). I expect security_scans:security_track_secure_scans to fire more often and take less time per job.

The queue RPS for security_scans::store_security_reports is reported as 0, presumably because of rounding, so I've used 0.05 as a more useful value. Average execution latency is reported as 1.34 minutes, which I'm assuming means 1 minute 34 seconds (94 seconds).

security_scans::store_security_reports uses the low-urgency-cpu-bound shard, which has an average total execution time of 11s and an average throughput of 24 jobs completed per second.

new_queue_consumption = 0.05 * 94 = 4.7
shard_consumption = 24 * 11 = 264
Increased workload = 4.7 / 264 * 100 = 1.78%
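
The same arithmetic can be reproduced with a short Python sketch, which makes it easy to re-run with different estimates. The function and parameter names are made up for this example; the input values are the ones quoted above.

```python
def increased_workload_percent(queue_rps, job_seconds,
                               shard_jobs_per_second, shard_job_seconds):
    """Return the new queue's consumption as a percentage of the shard's.

    Consumption is modelled as (jobs per second) * (seconds per job),
    mirroring the three lines of arithmetic above.
    """
    new_queue_consumption = queue_rps * job_seconds
    shard_consumption = shard_jobs_per_second * shard_job_seconds
    return new_queue_consumption / shard_consumption * 100


# Values from the description: 0.05 RPS, 94 s per job,
# shard at 24 jobs/s with 11 s average execution time.
pct = increased_workload_percent(0.05, 94, 24, 11)
print(f"Increased workload = {pct:.2f}%")
```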

Given that these estimates are conservative, the queue is low priority, and the increased workload is below the 5% threshold, I don't expect this to be an issue.

Does this MR meet the acceptance criteria?

Conformity

I have included changelog trailers, or none are needed. (Does this MR need a changelog?)
I have added/updated documentation, or it's not needed. (Is documentation required?)
I have properly separated EE content from FOSS, or this MR is FOSS only. (Where should EE code go?)
I have added information for database reviewers in the MR description, or it's not needed. (Does this MR have database related changes?)
I have self-reviewed this MR per code review guidelines.
This MR does not harm performance, or I have asked a reviewer to help assess the performance impact. (Merge request performance guidelines)
I have followed the style guides.
This change is backwards compatible across updates, or this does not apply.

Availability and Testing

I have added/updated tests following the Testing Guide, or it's not needed. (Consider all test levels. See the Test Planning Process.)
I have tested this MR in all supported browsers, or it's not needed.
I have informed the Infrastructure department of a default or new setting change per definition of done, or it's not needed.

Edited Jul 14, 2021 by Cameron Swords
