Implement soft rate limiting for SBOM Scan Processing

Why are we doing this work

To prevent abuse of the SBOM Scan API and maintain service quality for all users, we need to implement a soft rate limiting mechanism at the service layer. This will ensure that heavy usage by individual projects doesn't impact the shared processing capacity and cause delays or timeouts for other users.

The current implementation processes all SBOM scans with the same priority and queue. This could lead to resource contention when projects exceed reasonable usage thresholds, or prevent heavy users from continuing to use the feature beyond that threshold.

Relevant links

Non-functional requirements

  • Performance: Maintain fast processing for normal usage while gracefully degrading for heavy users
  • Feature flag: n/a
  • Documentation: n/a
  • Testing: Comprehensive specs for rate limiting logic and queue routing

TODO: check the potential impact on metrics, and the possibility of flagging throttled scans so they can be distinguished and not mistaken for an undesired performance degradation.

Proposed behavior

Normal usage (under threshold):

  • Scans processed on high-priority queue (sbom_scans)
  • Fast processing with :high urgency
  • Standard API responses

Heavy usage (over threshold):

  • New scans routed to throttled queue (sbom_scans_throttled); see the routing sketch after this list
  • Lower urgency (:low) with higher concurrency limits
  • API returns rate limit headers
  • Client displays warning about increased processing time
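
As a rough illustration of the routing described above, here is a minimal Ruby sketch, assuming the limit is checked through Gitlab::ApplicationRateLimiter with the :dependency_scanning_sbom_scan_api_throttling key from the verification steps; the service class and its integration point are placeholders, not existing code:

```ruby
# Sketch only: the service class and its integration point are illustrative;
# the worker names come from the implementation plan below.
class ProcessSbomScanService
  def execute(sbom_scan)
    if throttled?(sbom_scan.project)
      # Over threshold: route to the throttled, low-urgency queue.
      ProcessSbomScanThrottledWorker.perform_async(sbom_scan.id)
    else
      # Under threshold: keep the fast, high-urgency path.
      ProcessSbomScanWorker.perform_async(sbom_scan.id)
    end
  end

  private

  def throttled?(project)
    # Key name taken from the verification steps; the threshold and interval
    # would live in Gitlab::ApplicationRateLimiter.rate_limits (see MR 1).
    ::Gitlab::ApplicationRateLimiter.throttled?(
      :dependency_scanning_sbom_scan_api_throttling,
      scope: [project]
    )
  end
end
```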

Implementation plan (WIP)

MR 1: Fixed soft rate limit

  • Implement the soft rate limiting logic with hardcoded threshold/interval values, using Gitlab::ApplicationRateLimiter (sketched after this list)
  • Create a ProcessSbomScanThrottledWorker performing the same operation as ProcessSbomScanWorker, but with :low urgency and a higher concurrency limit (35)
  • Implement logic to route scans to throttled queue when threshold is exceeded
  • Add Retry-After rate limit header to the upload API response to inform the client when throttling occurs
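
A rough sketch of the MR 1 pieces, assuming the limit is registered in Gitlab::ApplicationRateLimiter.rate_limits with the 50 scans / 10 minutes example values from MR 3, that the concurrency_limit worker attribute is available, and that data_consistency and feature_category simply mirror the existing ProcessSbomScanWorker:

```ruby
# lib/gitlab/application_rate_limiter.rb (sketch): hardcoded limit entry.
# The 50 / 10.minutes values reuse the MR 3 example and are not final.
def rate_limits
  {
    # ... existing limits ...
    dependency_scanning_sbom_scan_api_throttling: { threshold: 50, interval: 10.minutes }
  }
end

# app/workers/process_sbom_scan_throttled_worker.rb (sketch)
class ProcessSbomScanThrottledWorker
  include ApplicationWorker

  data_consistency :sticky                 # assumption: same as the regular worker
  feature_category :dependency_management  # assumption
  urgency :low
  concurrency_limit -> { 35 }              # higher concurrency limit from the plan
  idempotent!

  def perform(sbom_scan_id)
    # Same processing as ProcessSbomScanWorker; only the queueing attributes differ.
    ProcessSbomScanWorker.new.perform(sbom_scan_id)
  end
end
```

Delegating to ProcessSbomScanWorker#perform keeps the processing logic in one place; whether that or a shared concern is the right shape can be settled in the MR. For the last bullet, assuming the upload endpoint is a Grape API, it can set the header with Grape's `header 'Retry-After', <interval in seconds>` when the rate limiter reports throttling.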

MR 2: Analyzer Client

  • Update the client to read the Retry-After rate limit header from the upload response (illustrated after this list)
  • Add logging/warning messages for throttled scans, making the impact and the Retry-After value explicit
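
The analyzer client's language isn't specified in this issue, so purely as an illustration (Ruby's Net::HTTP standing in for whatever HTTP layer the client actually uses, with placeholder endpoint and payload), reading the header and warning could look like:

```ruby
require 'net/http'
require 'uri'

# Illustration only: the endpoint URL and payload are placeholders.
uri = URI('https://gitlab.example.com/api/v4/sbom_scans') # placeholder upload endpoint
sbom_payload = File.read('sbom.cdx.json')                 # placeholder SBOM document

response = Net::HTTP.post(uri, sbom_payload, 'Content-Type' => 'application/json')
retry_after = response['Retry-After']

if retry_after
  # Make the impact explicit: the scan was accepted but will be processed on
  # the throttled queue, so results may take longer to appear.
  warn "SBOM scan throttled; results may take up to #{retry_after} extra seconds."
end
```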

MR 3 (optional): Configurable thresholds

  • Implement configurable thresholds for SBOM scan limits (e.g., 50 scans per 10 minutes)
  • Implement a configurable concurrency limit for the throttled Sidekiq worker

See previous plan

MR 1: Rate Limiting Infrastructure

  • Implement soft rate limiting logic (maybe add to Gitlab::ApplicationRateLimiter)
  • (TBD) Implement configurable thresholds for Sbom Scan limits (e.g., 50 scans per 10 minutes)

MR 2: Queue Routing Logic

  • Create ProcessSbomScanThrottledWorker with lower urgency and higher concurrency
  • Implement logic to route scans to the throttled queue when the threshold is exceeded

MR 3: API Response Enhancement

  • Add rate limit headers to API responses (X-RateLimit-Limit, X-RateLimit-Remaining, Retry-After)
  • Update API documentation with rate limiting behavior

MR 4: Analyzer Client

  • Update client to read rate limit headers and increase timeout
  • Add logging/warning messages for throttled scans

Configuration

We can possibly add admin settings to configure:

  • the rate limit threshold
  • the throttled queue concurrency limit
  • the rate limit window duration

This will give self-managed instance admins more flexibility to adjust these settings based on their respective infrastructure. This could also be done as a follow-up improvement.
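
If we do add settings, a hedged sketch of how the hardcoded values could be replaced, assuming new application settings whose names below are entirely hypothetical:

```ruby
# lib/gitlab/application_rate_limiter.rb (sketch): read the values from
# application settings instead of hardcoding them. Both setting names are
# hypothetical and would need new ApplicationSetting columns.
def rate_limits
  {
    # ... existing limits ...
    dependency_scanning_sbom_scan_api_throttling: {
      threshold: -> { application_settings.dependency_scanning_sbom_scan_limit },
      # Assumption: interval also resolves lambdas; otherwise keep it static.
      interval: -> { application_settings.dependency_scanning_sbom_scan_limit_interval }
    }
  }
end

# app/workers/process_sbom_scan_throttled_worker.rb (sketch): configurable concurrency.
# Gitlab::CurrentSettings exists; the setting name is hypothetical.
concurrency_limit -> { Gitlab::CurrentSettings.sbom_scan_throttled_worker_concurrency }
```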

Verification steps

  1. Configure rate limiting thresholds in admin settings if applicable, or manually adjust the :dependency_scanning_sbom_scan_api_throttling threshold and interval values in the ApplicationRateLimiter (see the console sketch after these steps).
  2. Create multiple SBOM scans for a single project within the time window
  3. Verify first N scans are processed normally on high-priority queue
  4. Verify subsequent scans are routed to throttled queue with appropriate headers
  5. Verify rate limit resets after the configured time window
  6. Test that other projects are unaffected by one project's heavy usage
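
For step 1, a possible way to exercise the limiter from a Rails console without producing 50 real scans, assuming the hardcoded key from MR 1:

```ruby
# Rails console sketch: hit the limiter repeatedly for one project and watch it
# flip to throttled once the threshold (e.g. 50 within the interval) is crossed.
project = Project.find_by_full_path('group/project') # placeholder project path

60.times do |i|
  throttled = ::Gitlab::ApplicationRateLimiter.throttled?(
    :dependency_scanning_sbom_scan_api_throttling,
    scope: [project]
  )

  puts "call #{i + 1}: throttled=#{throttled}"
end
```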

Benefits

  • Maintains availability: No hard limits, so heavy users can still use the feature
  • Preserves service quality: Prevents resource contention affecting other users
  • Transparent communication: Clients are informed about throttling via headers
  • Configurable: (optional) Admins can adjust thresholds based on infrastructure capacity
  • Gradual degradation: Performance degrades gracefully rather than failing
