Implement soft rate limiting for SBOM Scan Processing
Why are we doing this work
To prevent abuse of the SBOM Scan API and maintain service quality for all users, we need to implement a soft rate limiting mechanism at the service layer. This will ensure that heavy usage by individual projects doesn't impact the shared processing capacity and cause delays or timeouts for other users.
The current implementation processes all SBOM scans with the same priority and queue. This can lead to resource contention when projects exceed reasonable usage thresholds, while a hard limit would prevent heavy users from continuing to use the feature beyond that threshold.
Relevant links
- Related issue: #542831 (closed)
- Parent epic: &17150 (closed)
Non-functional requirements
- Performance: Maintain fast processing for normal usage while gracefully degrading for heavy users
- Feature flag: n/a
- Documentation: n/a
- Testing: Comprehensive specs for rate limiting logic and queue routing
TODO: check the potential impact on metrics and the possibility of flagging throttled scans, so they can be distinguished and not mistaken for an undesired performance degradation.
Proposed behavior
Normal usage (under threshold):
- Scans processed on the high-priority queue (`sbom_scans`)
- Fast processing with `:high` urgency
- Standard API responses
Heavy usage (over threshold):
- New scans routed to the throttled queue (`sbom_scans_throttled`)
- Lower urgency (`:low`) with higher concurrency limits (see the worker sketch below)
- API returns rate limit headers
- Client displays a warning about increased processing time
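For illustration, a minimal sketch of the throttled worker's shape, assuming GitLab's `ApplicationWorker` DSL; the `data_consistency` value, the exact `concurrency_limit` declaration, and the `Sbom::ProcessScanService` delegate are assumptions, while the concurrency value (35) comes from the plan below.

```ruby
# Minimal sketch, assuming GitLab's ApplicationWorker DSL.
class ProcessSbomScanThrottledWorker
  include ApplicationWorker

  data_consistency :sticky     # assumption: mirror ProcessSbomScanWorker
  urgency :low                 # throttled scans tolerate higher latency
  concurrency_limit -> { 35 }  # cap on concurrently running throttled jobs

  def perform(sbom_scan_id)
    # Same processing as ProcessSbomScanWorker; the service name is hypothetical.
    Sbom::ProcessScanService.new(sbom_scan_id).execute
  end
end
```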
Implementation plan (WIP)
MR 1: Fixed soft rate limit
- Implement soft rate limiting logic, hardcoded, using `Gitlab::ApplicationRateLimiter`
- Create a `ProcessSbomScanThrottledWorker` doing the same operation as `ProcessSbomScanWorker` but with a `:low` urgency and a higher concurrency (35)
- Implement logic to route scans to the throttled queue when the threshold is exceeded (sketched below)
- Add the `Retry-After` rate limit header to the upload API response to inform the client when throttling occurs
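A minimal sketch of the routing check, under the assumption that the `:dependency_scanning_sbom_scan_api_throttling` key is registered in `Gitlab::ApplicationRateLimiter.rate_limits`; the `enqueue_sbom_scan` helper and its arguments are hypothetical.

```ruby
# Minimal sketch; enqueue_sbom_scan and its arguments are hypothetical.
def enqueue_sbom_scan(project, sbom_scan)
  throttled = Gitlab::ApplicationRateLimiter.throttled?(
    :dependency_scanning_sbom_scan_api_throttling,
    scope: [project]
  )

  if throttled
    # Over the threshold: same work, lower urgency.
    ProcessSbomScanThrottledWorker.perform_async(sbom_scan.id)
  else
    ProcessSbomScanWorker.perform_async(sbom_scan.id)
  end

  throttled
end

# The Grape upload endpoint would then set the header, e.g.:
#   header 'Retry-After', interval_in_seconds.to_s if throttled
```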
MR 2: Analyzer Client
- Update the client to read the `Retry-After` rate limit header from the upload response
- Add logging/warning messages for throttled scans, making the impact and the `Retry-After` value explicit (see the sketch below)
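Illustration only, since the analyzer client may not be written in Ruby: a sketch of reading the header from the upload response and surfacing a warning; `upload_url`, `sbom_payload`, and `request_headers` are placeholders.

```ruby
require 'net/http'
require 'uri'

# Placeholders: upload_url, sbom_payload, request_headers.
response = Net::HTTP.post(URI(upload_url), sbom_payload, request_headers)
retry_after = response['Retry-After'] # nil when the scan was not throttled

if retry_after
  warn "SBOM scan throttled: processing may be delayed; " \
       "retry or poll after #{retry_after}s."
end
```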
MR 3 (optional): Configurable thresholds
- Implement configurable thresholds for SBOM scan limits (e.g., 50 scans per 10 minutes), as sketched below
- Implement configurable concurrency limits for the Sidekiq workers
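A sketch of what the configurable entry could look like; `Gitlab::ApplicationRateLimiter.rate_limits` resolves procs at check time, but the setting names below are assumptions.

```ruby
# Entry in Gitlab::ApplicationRateLimiter.rate_limits; the setting names
# are assumptions for illustration.
dependency_scanning_sbom_scan_api_throttling: {
  threshold: -> { application_settings.sbom_scan_rate_limit },       # e.g. 50
  interval: -> { application_settings.sbom_scan_rate_limit_window }  # e.g. 10.minutes
},
```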
Previous plan (superseded)
MR 1: Rate Limiting Infrastructure
- Implement soft rate limiting logic (maybe add to `Gitlab::ApplicationRateLimiter`)
- (TBD) Implement configurable thresholds for SBOM scan limits (e.g., 50 scans per 10 minutes)
MR 2: Queue Routing Logic
- Create `ProcessSbomScanThrottledWorker` with lower urgency and higher concurrency
- Implement logic to route scans to the throttled queue when the threshold is exceeded
MR 3: API Response Enhancement
- Add rate limit headers to API responses (`X-RateLimit-Limit`, `X-RateLimit-Remaining`, `Retry-After`), as sketched below
- Update API documentation with rate limiting behavior
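For reference, a sketch of setting these headers in the Grape upload endpoint; `limit`, `remaining`, `interval_in_seconds`, and `throttled` are placeholder values that would come from the rate limiter.

```ruby
# Inside the Grape upload endpoint; values on the right are placeholders.
header 'X-RateLimit-Limit', limit.to_s
header 'X-RateLimit-Remaining', remaining.to_s
header 'Retry-After', interval_in_seconds.to_s if throttled
```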
MR 4: Analyzer Client
- Update the client to read rate limit headers and increase its timeout
- Add logging/warning messages for throttled scans
Configuration
We can possibly add admin settings to configure:
- the rate limit threshold
- the throttled queue concurrency limit
- the rate limit window duration
This gives self-managed instance admins more flexibility to adjust these settings based on their respective infrastructure. This could also be done as a follow-up improvement.
Verification steps
- Configure rate limiting thresholds in the admin settings if applicable, or manually adjust the `:dependency_scanning_sbom_scan_api_throttling` `threshold` and `interval` values in the `ApplicationRateLimiter`
- Create multiple SBOM scans for a single project within the time window
- Verify the first N scans are processed normally on the high-priority queue (see the spec sketch after this list)
- Verify subsequent scans are routed to throttled queue with appropriate headers
- Verify rate limit resets after the configured time window
- Test that other projects are unaffected by one project's heavy usage
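These steps could be backed by specs along the following lines, a minimal sketch assuming Sidekiq's fake testing mode; the `enqueue_scan` helper and `threshold` value follow the plan above but their exact call sites are assumptions.

```ruby
it 'routes scans over the threshold to the throttled queue', :freeze_time do
  # First N scans within the window stay on the fast path.
  threshold.times { enqueue_scan }
  expect(ProcessSbomScanWorker.jobs.size).to eq(threshold)

  # The next scan within the same window goes to the throttled worker.
  enqueue_scan
  expect(ProcessSbomScanThrottledWorker.jobs.size).to eq(1)
end
```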
Benefits
- Maintains availability: no hard limits; heavy users can still use the feature
- Preserves service quality: Prevents resource contention affecting other users
- Transparent communication: Clients are informed about throttling via headers
- Configurable (optional): admins can adjust thresholds based on infrastructure capacity
- Gradual degradation: Performance degrades gracefully rather than failing