Bulk insert vulnerability scanners during continuous vulnerability scanning
## Summary
During continuous vulnerability scans, we create vulnerability scanner records sequentially, with a singular upsert per finding. This is inefficient and can be optimized by upserting the scanner objects in batches.
## Improvements
- Reduced network calls by batching the `INSERT` statements.
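To illustrate the improvement, here is a minimal sketch (not GitLab's actual `BulkInsertableTask` machinery; the table name, helper name, and quoting are illustrative only) of collapsing N per-row upserts into a single multi-row `INSERT ... ON CONFLICT` statement:

```ruby
# Build one multi-row upsert statement instead of issuing one INSERT
# per scanner row. Hypothetical helper for illustration only; real
# code must use parameterized queries, not string interpolation.
def build_bulk_upsert_sql(table, columns, rows, unique_by:)
  values = rows.map do |row|
    "(" + columns.map { |c| "'#{row[c]}'" }.join(", ") + ")"
  end.join(", ")

  "INSERT INTO #{table} (#{columns.join(', ')}) " \
    "VALUES #{values} " \
    "ON CONFLICT (#{unique_by.join(', ')}) DO UPDATE SET " +
    (columns - unique_by).map { |c| "#{c} = EXCLUDED.#{c}" }.join(", ")
end

rows = [
  { external_id: "gitlab-sast", project_id: 1, name: "SAST" },
  { external_id: "gitlab-sast", project_id: 2, name: "SAST" }
]

sql = build_bulk_upsert_sql(
  "vulnerability_scanners",
  %i[external_id project_id name],
  rows,
  unique_by: %i[external_id project_id]
)
```

One round trip now carries every scanner row, and the `ON CONFLICT` clause makes the statement an upsert on the unique key.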
## Risks
- We may return fewer rows from the bulk insert than expected. This can be caused by the `BulkInsertableTask` deduplication attribute. To mitigate this, we can wait for "Add `maps_with` method to BulkInsertableTask" (#439608, closed) to complete, or we can implement a local `maps_with` method to ensure that we access the returned values by a set key.
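The keyed-access mitigation can be sketched as follows (a hypothetical local helper in the spirit of the `maps_with` idea, not the actual GitLab API): instead of matching returned rows to inputs by position, index them by the unique key so deduplicated rows simply look up as absent.

```ruby
# Index rows returned by a bulk insert on their unique key
# (external_id, project_id) so positional lookups never drift when
# the database deduplicates and returns fewer rows than were sent.
def index_returned_rows(returned_rows)
  returned_rows.to_h do |row|
    [[row[:external_id], row[:project_id]], row[:id]]
  end
end

# Suppose two rows were sent but only one came back.
returned = [{ id: 10, external_id: "gitlab-sast", project_id: 1 }]
index = index_returned_rows(returned)

index[["gitlab-sast", 1]] # present row yields its scanner id
index[["gitlab-sast", 2]] # deduplicated row yields nil, not a wrong id
```

A positional lookup would silently associate the wrong scanner id with the second input; the keyed lookup fails loudly (returns `nil`) instead.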
## Involved components
## Optional: Intended side effects
- Faster continuous vulnerability scans
## Optional: Missing test coverage
## Implementation Plan
- Create a new task under `ee/app/services/security/ingestion/tasks/` called `ingest_gitlab_vulnerability_scanner.rb`
- The task should be unique by `external_id` and `project_id`
- The task uses the `id` (scanner id) and the `project_id` to index the returned values
- The task bulk inserts into the `Vulnerabilities::Scanner` class's table
- It inserts the `Gitlab::VulnerabilityScanning::Scanner` scanner attributes, with the only difference being the `project_id`.
- Remove the call to `scanner_for_project` (this was making a singular upsert for each finding).
- Add the task to the CVS service tasks
- Verify that existing tests pass as expected, and that they cover the following edge cases:
  - Singular project finding without an existing scanner
  - Singular project finding with an existing scanner
  - Multiple project findings with both an existing and a non-existing scanner
- Create a spec for the task
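The core of the planned task can be sketched in plain Ruby (class name, constructor, and method names are hypothetical, not GitLab's actual task API): take the shared `Gitlab::VulnerabilityScanning::Scanner` attributes, stamp each row with its finding's `project_id`, and deduplicate on the unique key before handing the rows to a single bulk upsert.

```ruby
# Illustrative sketch of the planned task's row-building step, under
# the assumptions stated above. Deduplicating on (external_id,
# project_id) mirrors the task's uniqueness constraint.
class IngestGitlabVulnerabilityScanner
  UNIQUE_BY = %i[external_id project_id].freeze

  def initialize(findings, scanner_attributes)
    @findings = findings
    @scanner_attributes = scanner_attributes
  end

  # Rows for the bulk upsert: one per unique (external_id, project_id)
  # pair across all findings in the batch.
  def attributes
    @findings
      .map { |f| @scanner_attributes.merge(project_id: f[:project_id]) }
      .uniq { |row| row.values_at(*UNIQUE_BY) }
  end
end

task = IngestGitlabVulnerabilityScanner.new(
  [{ project_id: 1 }, { project_id: 1 }, { project_id: 2 }],
  { external_id: "gitlab-sast", name: "SAST" }
)
task.attributes # two rows: one for project 1, one for project 2
```

Three findings across two projects collapse into two scanner rows, which is exactly the dedup behavior the edge-case tests above should exercise.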