Stagger advisory scanning over time
Problem to solve
Advisory scanning is currently susceptible to a thundering herd problem, which drives up resource consumption. Package Metadata ingestion publishes a scan event immediately after ingesting an advisory, and nothing restricts the number of scans that can occur after ingestion.
Proposal
Update advisory ingestion to stagger the scan events, adding a delay for each subsequent advisory. For example, if 10 advisories are ingested with a scan delay of one minute, the first advisory scan starts immediately and the last one starts 9 minutes later.
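The delay arithmetic can be sketched in plain Ruby. This is a minimal illustration, not the ingestion code itself: `staggered_delays` is a hypothetical helper, and it assumes zero-based indexing, so the Nth advisory waits N - 1 intervals.

```ruby
# Hypothetical helper: returns the delay in seconds for each advisory,
# assuming the first scan runs immediately and every later scan is
# pushed back by one more interval.
def staggered_delays(advisory_count, interval_seconds)
  Array.new(advisory_count) { |idx| idx * interval_seconds }
end

delays = staggered_delays(10, 60)
puts delays.first      # delay of the first scan, in seconds
puts delays.last / 60  # delay of the last scan, in minutes
```

With 10 advisories and a 60-second interval, the first delay is 0 and the last is 540 seconds, so the whole batch is spread over 9 minutes.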
Implementation plan
- Change the scanning signal from using the event store to a simple AdvisoryScanWorker.perform_in(delay): https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/app/services/package_metadata/ingestion/advisory/ingestion_service.rb#L40-
Example code
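    SCAN_DELAY_INTERVAL = 60.seconds

    # Schedule one scan per advisory, each one interval later than the previous.
    publishable_advisories.each_with_index do |advisory, idx|
      # Passing advisory.id assumes the worker identifies the advisory by id.
      PackageMetadata::AdvisoryScanWorker.perform_in(idx * SCAN_DELAY_INTERVAL, advisory.id)
    end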
- Update the AdvisoryScanWorker to remove the event store subscriber concern https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/app/workers/package_metadata/advisory_scan_worker.rb#L5
- Update handle_event to be a simple perform
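The plan above can be sketched end to end without Sidekiq or Rails. `FakeAdvisoryScanWorker` is a stand-in, not the real `PackageMetadata::AdvisoryScanWorker`; it only mimics the `perform_in(interval, *args)` shape by recording jobs instead of enqueuing them. The advisory ids and the `perform(advisory_id)` signature are assumptions made for illustration.

```ruby
# Stand-in for PackageMetadata::AdvisoryScanWorker. The real class is a
# Sidekiq worker; this one only records what would have been scheduled.
class FakeAdvisoryScanWorker
  ScheduledJob = Struct.new(:delay, :args)

  class << self
    # Mimics Sidekiq's perform_in(interval, *args) by recording the job.
    def perform_in(delay, *args)
      scheduled_jobs << ScheduledJob.new(delay, args)
    end

    def scheduled_jobs
      @scheduled_jobs ||= []
    end
  end

  # Plain perform in place of the event store's handle_event(event).
  # Taking an advisory id as the argument is an assumption.
  def perform(advisory_id)
    "scanning advisory #{advisory_id}"
  end
end

# Plain seconds here; the example in the issue uses ActiveSupport's 60.seconds.
SCAN_DELAY_INTERVAL = 60

# Hypothetical ids standing in for publishable_advisories.
publishable_advisory_ids = [101, 102, 103]

publishable_advisory_ids.each_with_index do |advisory_id, idx|
  FakeAdvisoryScanWorker.perform_in(idx * SCAN_DELAY_INTERVAL, advisory_id)
end

FakeAdvisoryScanWorker.scheduled_jobs.each do |job|
  puts "delay=#{job.delay}s args=#{job.args.inspect}"
end
```

Because the worker is invoked directly with its arguments, it no longer needs to unpack an event payload, which is what makes dropping the event store subscriber concern possible.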