Bulk insert vulnerability scanners during continuous vulnerability scanning
## Summary
During continuous vulnerability scans, we create vulnerability scanner records sequentially, with a singular upsert per finding. This is inefficient and can be optimized by upserting the scanner objects in batches.
## Improvements
- Reduced network calls by batching the `INSERT` statements.
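To illustrate the improvement, here is a minimal sketch (not GitLab's actual `BulkInsertableTask` machinery; the table name, helper name, and quoting are illustrative only) of collapsing N per-row upserts into a single multi-row `INSERT ... ON CONFLICT` statement:

```ruby
# Build one multi-row upsert statement instead of issuing one INSERT
# per scanner row. Hypothetical helper for illustration only; real
# code must use parameterized queries, not string interpolation.
def build_bulk_upsert_sql(table, columns, rows, unique_by:)
  values = rows.map do |row|
    "(" + columns.map { |c| "'#{row[c]}'" }.join(", ") + ")"
  end.join(", ")

  "INSERT INTO #{table} (#{columns.join(', ')}) " \
    "VALUES #{values} " \
    "ON CONFLICT (#{unique_by.join(', ')}) DO UPDATE SET " +
    (columns - unique_by).map { |c| "#{c} = EXCLUDED.#{c}" }.join(", ")
end

rows = [
  { external_id: "gitlab-sast", project_id: 1, name: "SAST" },
  { external_id: "gitlab-sast", project_id: 2, name: "SAST" }
]

sql = build_bulk_upsert_sql(
  "vulnerability_scanners",
  %i[external_id project_id name],
  rows,
  unique_by: %i[external_id project_id]
)
```

One round trip now carries every scanner row, and the `ON CONFLICT` clause makes the statement an upsert on the unique key.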
## Risks
- We may return fewer rows from the bulk insert than expected. This can be caused by the `BulkInsertableTask` deduplication attribute. To mitigate this, we can wait for "Add `maps_with` method to BulkInsertableTask" (#439608, closed) to complete, or we can implement a local `maps_with` method to ensure that we access the returned values by a set key.
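The keyed-access mitigation can be sketched as follows (a hypothetical local helper in the spirit of the `maps_with` idea, not the actual GitLab API): instead of matching returned rows to inputs by position, index them by the unique key so deduplicated rows simply look up as absent.

```ruby
# Index rows returned by a bulk insert on their unique key
# (external_id, project_id) so positional lookups never drift when
# the database deduplicates and returns fewer rows than were sent.
def index_returned_rows(returned_rows)
  returned_rows.to_h do |row|
    [[row[:external_id], row[:project_id]], row[:id]]
  end
end

# Suppose two rows were sent but only one came back.
returned = [{ id: 10, external_id: "gitlab-sast", project_id: 1 }]
index = index_returned_rows(returned)

index[["gitlab-sast", 1]] # present row yields its scanner id
index[["gitlab-sast", 2]] # deduplicated row yields nil, not a wrong id
```

A positional lookup would silently associate the wrong scanner id with the second input; the keyed lookup fails loudly (returns `nil`) instead.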
## Involved components
## Optional: Intended side effects
- Faster continuous vulnerability scans
## Optional: Missing test coverage
## Implementation Plan
- Create a new task under `ee/app/services/security/ingestion/tasks/` called `ingest_gitlab_vulnerability_scanner.rb`
- The task should be unique by `external_id` and `project_id`
- The task uses the `id` (scanner id) and the `project_id` to index the returned values
- The task bulk inserts into the `Vulnerabilities::Scanner` class's table
- It inserts the `Gitlab::VulnerabilityScanning::Scanner` scanner attributes, with the only difference being the `project_id`.
- Remove the call to `scanner_for_project` (this was making a singular upsert for each finding).
- Add the task to the CVS service tasks
- Verify that existing tests pass as expected, and that they cover the following edge cases:
  - Singular project finding without an existing scanner
  - Singular project finding with an existing scanner
  - Multiple project findings with both an existing and a non-existing scanner
- Create a spec for the task
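The core of the planned task can be sketched in plain Ruby (class name, constructor, and method names are hypothetical, not GitLab's actual task API): take the shared `Gitlab::VulnerabilityScanning::Scanner` attributes, stamp each row with its finding's `project_id`, and deduplicate on the unique key before handing the rows to a single bulk upsert.

```ruby
# Illustrative sketch of the planned task's row-building step, under
# the assumptions stated above. Deduplicating on (external_id,
# project_id) mirrors the task's uniqueness constraint.
class IngestGitlabVulnerabilityScanner
  UNIQUE_BY = %i[external_id project_id].freeze

  def initialize(findings, scanner_attributes)
    @findings = findings
    @scanner_attributes = scanner_attributes
  end

  # Rows for the bulk upsert: one per unique (external_id, project_id)
  # pair across all findings in the batch.
  def attributes
    @findings
      .map { |f| @scanner_attributes.merge(project_id: f[:project_id]) }
      .uniq { |row| row.values_at(*UNIQUE_BY) }
  end
end

task = IngestGitlabVulnerabilityScanner.new(
  [{ project_id: 1 }, { project_id: 1 }, { project_id: 2 }],
  { external_id: "gitlab-sast", name: "SAST" }
)
task.attributes # two rows: one for project 1, one for project 2
```

Three findings across two projects collapse into two scanner rows, which is exactly the dedup behavior the edge-case tests above should exercise.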