Avoid triggering a re-scan when dependencies haven't changed
Why are we doing this work
Currently, SBOM scans are triggered regardless of whether a project's dependencies have actually changed. This leads to unnecessary resource use and complexity. By implementing mechanisms like change detection at the job and analyzer levels, and artifact caching on the monolith side, we can significantly reduce the number of unnecessary scans while maintaining security coverage.
Relevant links
- Implement soft rate limiting for SBOM Scan Proc... (#561759 - closed) • Olivier Gonzalez • 18.4
- Dependencies persist when supported files are r... (#560331) • Unassigned • 18.6
Problem statement
The current SBOM scanning process:
- Triggers scans on every pipeline run, regardless of whether dependencies changed (e.g. whether `Gemfile.lock` was updated)
- Processes identical dependencies and generates the same results
- Wastes computational resources by triggering dependency detection in the analyzer and the scan in the monolith
Proposal
The old proposal listed a template-level optimization based on change detection. This is no longer straightforward to add because the v2 template no longer does existence checks. Analyzer-side dependency change detection would also run into problems: even when all components are unchanged, a new advisory may have been issued, which would require a re-scan. Serving a cached report is closer to what we can accomplish with the Sbom Scanning API.
A few options have been considered:
- Give analyzer scan decisions
- Keep scan decisions on the instance side
1. Give analyzer scan decisions
Allow the analyzer to decide whether scanning is necessary at all: it would send a hash of an sbom and get back whether a cached scan result already exists.
Pros
- Minimum IO and resource use on the instance side
Cons
- Complex
- Adds new abstractions just for caching
2. Keep scan decisions on the instance side
Conversely, analyzer changes could be kept to a minimum with caching entirely compartmentalized on the instance side.
Pros
- Simpler implementation
- Uses same abstractions
Cons
- Significant IO use on the instance side
- Caching is opaque to the analyzer and the job user
- Background processing resources still necessary
Conclusion
Proposal 2 is chosen to keep complexity of this feature minimal, retain existing data structures, and iterate faster.
The logical place to add this functionality is `ProcessSbomScanService`, because it is already behind the resource and background-job constraints imposed on scans (throttling, one worker run per build, etc.), and the `SbomScan` records holding copied scan results will be subject to the same TTL constraints as those that were actually scanned.
Old proposal
Implement a multi-layered approach to avoid unnecessary scans:
1. Template-level optimization
- Use `rules:changes` instead of `rules:exists` in CI templates
- Target dependency-related files (e.g., `package.json`, `Gemfile`, `requirements.txt`, etc.)
- Skip scan jobs entirely when no dependency changes are detected
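As a sketch, the template change described above could look something like the following. The job name and file list are illustrative only, not the actual template contents:

```yaml
# Hypothetical CI template fragment: run the scan job only when
# dependency-related files changed, instead of merely existing.
dependency-scanning:
  script:
    - /analyzer run
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
      changes:
        - "**/package.json"
        - "**/Gemfile.lock"
        - "**/requirements.txt"
```

Note that `rules:changes` evaluates against the pipeline's ref comparison, which is part of why merge request contexts and branch differences are listed as challenges below.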
2. Analyzer dependency change detection
Detect whether any supported files actually changed in the repo before generating a scannable artifact.
3. Cached report serving
- Return cached security reports for unchanged dependency sets
- Implement cache invalidation based on dependency changes
Requirements
- Vuln Scanning API: If an sbom has been previously scanned as part of another job, do not scan again but rather reuse the scan results
- Non-functional
- Observability data for when cache is hit
- Feature gate on the caching functionality
- A way to tell the analyzer that the scan result came from cache
Implementation plan
Update sbom scan processing to check whether an identical sbom has already been scanned. If it has, copy the scan result from the other `sbom_scan`.
- Migration: update `SbomScan` model and table to store an `sbom_hash` and `advisory_db_version`
- Add scoped lookup to the model to fetch `SbomScan` records by `sbom_hash`
- Update `::Gitlab::Ci::Reports::Sbom::Reports` to add a `components_hash` implementation
- Update `ProcessSbomScanService`
  - Hash the current `SbomScan` record's sbom
  - Fetch `SbomScan` records by this hash
  - Check if scan results exist
    - Identify purl_types in sboms
    - For each purl_type fetch `Checkpoint`
      - This will serve the advisory database version
    - Create `advisory_db_version` by combining checkpoint `purl_type` and `sequence`, and then combining with all the other purl_types
    - Fetch `SbomScan` records matching `project_id`, `sbom_hash`, `advisory_db_version`
  - If an existing `SbomScan` is found
    - Create a new result file in object storage
    - Copy contents from the existing sbom scan and store this as the result
  - Save caching info
    - `sbom_hash` and `advisory_db_version`
    - This happens for both conditions, whether a scan result is found or it isn't
- Other pieces
  - Feature flag
  - Observability additions
Only one issue is needed, but the work can be broken up into several MRs, roughly following the main points in the implementation plan above.
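The cache-check portion of the plan can be sketched in plain Ruby. This is illustrative only: the helper names (`advisory_db_version_for`, `find_cached_scan`) and the in-memory `previous_scans` collection are assumptions standing in for the real `SbomScan` scoped lookups and object storage:

```ruby
# Combine each purl_type's checkpoint (purl_type + sequence) into a single
# advisory DB version string, so that a new advisory export for any
# purl_type invalidates the cache.
def advisory_db_version_for(purl_types, checkpoints)
  purl_types.sort.map { |type| "#{type}:#{checkpoints.fetch(type)}" }.join(',')
end

# Look up a previously scanned sbom by (project_id, sbom_hash,
# advisory_db_version). `previous_scans` stands in for the SbomScan lookup.
def find_cached_scan(previous_scans, project_id:, sbom_hash:, advisory_db_version:)
  previous_scans.find do |scan|
    scan[:project_id] == project_id &&
      scan[:sbom_hash] == sbom_hash &&
      scan[:advisory_db_version] == advisory_db_version
  end
end

# --- Example walk-through ---
checkpoints = { 'gem' => 42, 'npm' => 17 }
version = advisory_db_version_for(%w[npm gem], checkpoints)

previous_scans = [
  { project_id: 1, sbom_hash: 'abc123', advisory_db_version: version,
    result: { 'vulnerabilities' => [] } }
]

hit = find_cached_scan(previous_scans, project_id: 1, sbom_hash: 'abc123',
                       advisory_db_version: version)
puts hit ? 'cache hit: copy stored result' : 'cache miss: run the scan'
```

The key property is that a cache hit requires all three of `project_id`, `sbom_hash`, and `advisory_db_version` to match, so a new advisory database checkpoint forces a fresh scan even for an unchanged sbom.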
Technical considerations
Sbom hash
The sbom hash needs to identify the sbom uniquely using only the attributes that are scannable. This means data under `components`.
- Unknown: should qualifiers be considered as part of the sbom hash (e.g. nuget uses package hashes)? At first glance it should only be the attributes used for scanning (name and version).
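A minimal sketch of how such a `components_hash` could be computed, assuming only name and version are included (the attribute set and digest choice are open questions, not decided implementation details):

```ruby
require 'digest'
require 'json'

# Hypothetical components_hash: digest only the scannable attributes
# (name and version), sorted so component order doesn't change the hash.
def components_hash(components)
  canonical = components
    .map { |c| [c.fetch('name'), c.fetch('version')] }
    .sort
  Digest::SHA256.hexdigest(JSON.generate(canonical))
end

a = [{ 'name' => 'rails', 'version' => '7.1.0' },
     { 'name' => 'rack',  'version' => '3.0.8' }]
b = a.reverse # same components, different order

puts components_hash(a) == components_hash(b) # => true
```

Sorting before digesting makes the hash insensitive to component ordering in the generated sbom, while any name or version change produces a different hash.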
Access
Scan result access will be scoped at the project level.
A CI job token is used to authenticate sbom uploads. It is currently unlikely, but still somewhat unknown, whether there is a reason one job shouldn't have access to the scan result of another.
API update
To communicate to the analyzer that a cached result is being served, we can update the scan result response. It is not currently clear whether this is simple or desirable.
Old technical considerations
Challenges with rules:changes
- Current limitations with CI job skipping in complex scenarios
- Need to handle merge request contexts and branch differences
- Fallback mechanisms for when change detection is uncertain
Artifact caching
- Proper cache keys and keying mechanism to synchronize analyzer and backend
Rollout plan
This functionality should be rolled out behind a feature flag so that it can be enabled gradually. Adequate testing scenarios should exist before the rollout plan is executed.
Testing plan
TBC