
New Asynchronous Commit Status Endpoint

Problem

Currently, commit status creation can take up to 1 minute, and clients may exhaust the maximum number of retries when posting, in which case the request is rejected. We've increased the TTL on the lock to prevent duplicate commit statuses, but a longer lock makes it more likely that an update is rejected, and the client needs to handle that. I'm increasing the sleep between retries in !208340 to reduce the number of 409s returned, but requests can still hang while waiting on other requests. Keeping requests open for a long time is generally bad for reliability.

Work that can take up to a minute shouldn't be done inside a web request.

Proposed Solution

Create a new asynchronous version of the commit status endpoint that:

  • Processes commit statuses using Sidekiq with exponential backoff retries
  • Guarantees delivery once the pipeline's SHA is no longer locked
  • Requires clients to fetch status separately after submission
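
For a rough sense of what "exponential backoff retries" means in wall-clock time, here is a small sketch. It assumes the base delay described for Sidekiq's default retry policy, (retry_count ** 4) + 15 seconds, and leaves out the random jitter Sidekiq adds on top, so the real schedule will be somewhat longer. The names `base_delay` and `retry_schedule` are illustrative only.

```ruby
# Rough estimate of the default retry schedule.
# Assumption: base delay of (retry_count ** 4) + 15 seconds, jitter ignored.
MAX_RETRIES = 25

def base_delay(retry_count)
  (retry_count**4) + 15
end

# Delay in seconds before each retry, from the first to the last.
def retry_schedule(max_retries = MAX_RETRIES)
  (0...max_retries).map { |count| base_delay(count) }
end

schedule = retry_schedule
puts "first delays (s): #{schedule.first(5).join(', ')}"
puts "total without jitter: #{(schedule.sum / 86_400.0).round(1)} days"
```

Under these assumptions the first few retries happen within seconds to minutes, which is the window that matters for a SHA lock held for about a minute. The long tail of the schedule only comes into play if something else keeps failing.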

Technical Details

  • Create new API endpoint version
  • Implement Sidekiq background processing
    • Rather than retrying via in_lock, we should let FailedToObtainLockError raise after the first attempt, which triggers Sidekiq's retries with exponential backoff. We can leave the job at Sidekiq's default of 25 retries, which are spread over roughly three weeks.
  • Maintain existing synchronous endpoint for backward compatibility
    • Mark old endpoint as deprecated
    • Continue supporting it indefinitely to avoid breaking changes
  • Provide an endpoint for fetching the new status by polling (this may be the existing commit status list endpoint).
  • We need to design a way to prevent statuses sent earlier from overriding the job's more current status.
  • Team members had also suggested separating create and update operations for clarity. We could discuss this further before implementing.
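
The retry and ordering points above could be sketched roughly as follows. This is plain Ruby with in-memory stand-ins, not the real implementation: `ShaLock`, `CreateCommitStatusJob`, the `store` hash, and the `submitted_at` parameter are all hypothetical, and only the `FailedToObtainLockError` name comes from this proposal. The ordering guard (comparing a submission timestamp) is just one possible design, shown to make the problem concrete.

```ruby
# Hypothetical stand-in for the real error class mentioned in the proposal.
class FailedToObtainLockError < StandardError; end

# In-memory lock keyed by SHA, standing in for the real lock.
class ShaLock
  def initialize
    @held = {}
  end

  # Single attempt: raises immediately instead of sleeping and retrying.
  def with_lock(sha)
    raise FailedToObtainLockError, "SHA #{sha} is locked" if @held[sha]

    @held[sha] = true
    begin
      yield
    ensure
      @held.delete(sha)
    end
  end

  def hold(sha)
    @held[sha] = true
  end

  def release(sha)
    @held.delete(sha)
  end
end

class CreateCommitStatusJob
  # store maps [sha, name] => { state:, submitted_at: }
  def initialize(lock:, store:)
    @lock = lock
    @store = store
  end

  # In a real Sidekiq job, letting the lock error propagate out of
  # perform is what would hand the job to the retry mechanism.
  def perform(sha, name, state, submitted_at)
    @lock.with_lock(sha) do
      key = [sha, name]
      current = @store[key]
      # Ordering guard (assumed design): ignore payloads older than the stored one.
      return :skipped_stale if current && current[:submitted_at] > submitted_at

      @store[key] = { state: state, submitted_at: submitted_at }
      :applied
    end
  end
end

lock = ShaLock.new
job = CreateCommitStatusJob.new(lock: lock, store: {})

lock.hold("abc123")
begin
  job.perform("abc123", "build", "running", 1)
rescue FailedToObtainLockError
  puts "locked, the job would be retried later"
end
lock.release("abc123")

puts job.perform("abc123", "build", "success", 2)
puts job.perform("abc123", "build", "running", 1)
```

The point of the guard is that a retried job can run long after a newer status for the same SHA and name has been applied, so the job has to check before writing. Whether the ordering key should be a client timestamp, a server-assigned sequence, or something else is still open.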

Benefits

  • More reliable commit status creation
  • Better handling of long-running status operations
  • Reduced likelihood of client-side failures

Note

Unlike the current endpoint, the new endpoint will not return the commit status immediately, because processing happens asynchronously. Consumers have to switch to the new endpoint to benefit, so the improvements may not be realized immediately.
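
From the consumer's side, "fetch separately" would probably look like a small polling loop. The sketch below is an assumption about client behaviour, not part of the proposal: `wait_for_status` and `fetch_statuses` are hypothetical names, and `fetch_statuses` stands in for an HTTP call to whichever endpoint ends up being used for polling.

```ruby
# Polls until a status with the given name is visible or attempts run out.
# fetch_statuses: hypothetical callable returning an array of hashes.
# sleeper: injectable so the loop can be exercised without real waiting.
def wait_for_status(name:, fetch_statuses:, attempts: 5, interval: 1.0,
                    sleeper: Kernel.method(:sleep))
  attempts.times do |attempt|
    found = fetch_statuses.call.find { |status| status[:name] == name }
    return found if found

    # No point sleeping after the final attempt.
    sleeper.call(interval) unless attempt == attempts - 1
  end
  nil
end

# Example with canned responses: the status shows up on the third poll.
responses = [[], [], [{ name: "build", state: "success" }]]
result = wait_for_status(
  name: "build",
  fetch_statuses: -> { responses.shift || [] },
  sleeper: ->(_seconds) {} # skip real sleeping in the example
)
puts result.inspect
```

Clients that do not need to confirm the status can skip polling entirely and treat the submission as fire-and-forget.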

Edited by Allison Browne