MR pipelines page loading spinner stuck when pipeline creation takes longer than expected

Description

Summary

When the ci_pipeline_creation_requests_realtime feature flag is enabled, the loading spinner and skeleton loader on the merge request pipelines page can get stuck indefinitely on projects where pipeline creation takes a long time to complete. The pipeline is created successfully, but the UI remains in a loading state because the WebSocket subscription never receives the final status update.

Root Cause

The ciPipelineCreationRequestsUpdated GraphQL subscription is not receiving the SUCCEEDED or FAILED status update from the backend when pipeline creation completes on larger/more complex projects. The subscription correctly receives the initial IN_PROGRESS status, but the completion message never arrives via WebSocket.

Steps to Reproduce

  1. Enable the ci_pipeline_creation_requests_realtime feature flag on a project with complex CI/CD configuration (many includes, extends, or jobs)
  2. Navigate to a merge request's pipelines tab
  3. Open browser DevTools > Network > WS tab and filter for cable
  4. Click "Run Pipeline"
  5. Observe:
    • WebSocket receives message with status: "IN_PROGRESS"
    • Skeleton loader and loading spinner appear
    • Pipeline is created (visible in pipelines table after polling)
    • No WebSocket message with status: "SUCCEEDED" is received
    • Loading spinner remains indefinitely
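
To make the observation in step 5 less manual, the frames from the WS tab can be checked with a small script. The following is a minimal sketch, not GitLab code: summarizeFrames and FrameSummary are hypothetical names, and the helper deliberately matches the quoted status values as plain substrings so that it does not assume any particular payload shape.

```typescript
// Hypothetical diagnostic helper, not part of GitLab.
// Input: the raw text of each WebSocket frame, e.g. copied from DevTools.
type CreationStatus = "IN_PROGRESS" | "SUCCEEDED" | "FAILED";

const STATUSES: CreationStatus[] = ["IN_PROGRESS", "SUCCEEDED", "FAILED"];

interface FrameSummary {
  seen: CreationStatus[]; // statuses in order of first appearance
  completionReceived: boolean; // true if SUCCEEDED or FAILED was seen
}

function summarizeFrames(frames: string[]): FrameSummary {
  const seen: CreationStatus[] = [];
  for (const frame of frames) {
    for (const status of STATUSES) {
      // Substring match on the quoted value; the payload shape is not assumed.
      const inFrame = frame.indexOf('"' + status + '"') !== -1;
      if (inFrame && seen.indexOf(status) === -1) {
        seen.push(status);
      }
    }
  }
  const completionReceived =
    seen.indexOf("SUCCEEDED") !== -1 || seen.indexOf("FAILED") !== -1;
  return { seen, completionReceived };
}
```

If completionReceived is false for a capture in which the pipeline has already appeared in the table, the capture matches the behavior described in this issue.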

Expected Behavior

  • WebSocket should receive the SUCCEEDED status update when pipeline creation completes
  • Loading spinner should disappear
  • Skeleton loader should be replaced with the actual pipeline row

Technical Details

  1. When "Run Pipeline" is clicked, MergeRequests::CreatePipelineService#execute_async is called
  2. A pipeline creation request is stored in Redis with IN_PROGRESS status
  3. GraphqlTriggers.ci_pipeline_creation_requests_updated is called, triggering the WebSocket subscription
  4. The frontend receives the IN_PROGRESS status and displays the skeleton loader
  5. Pipeline creation happens asynchronously in MergeRequests::CreatePipelineWorker
  6. When complete, Ci::PipelineCreation::Requests.succeeded updates Redis status to SUCCEEDED
  7. GraphqlTriggers.ci_pipeline_creation_requests_updated is called again in create_merge_request_pipeline
  8. Issue: On larger projects, this second WebSocket message never reaches the frontend
  9. The pipelines.json polling eventually fetches the new pipeline and adds it to the table
  10. The loading spinner nevertheless persists, because only a WebSocket status update clears it
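
Steps 8 to 10 can be modeled as a small state reducer. This is an illustrative sketch only: TabState, UiEvent, reduceTab, and replay are hypothetical names and do not reflect the actual frontend implementation. The one assumption carried over from the steps above is that the loading flag changes only on subscription messages, while polling only refreshes the table.

```typescript
// Illustrative model of the pipelines tab; all names are hypothetical.
type RequestStatus = "IN_PROGRESS" | "SUCCEEDED" | "FAILED";

type UiEvent =
  | { kind: "subscription"; status: RequestStatus }
  | { kind: "poll"; pipelineIds: number[] };

interface TabState {
  isCreating: boolean; // drives the spinner and skeleton loader
  pipelineIds: number[]; // rows shown in the pipelines table
}

const initialState: TabState = { isCreating: false, pipelineIds: [] };

function reduceTab(state: TabState, event: UiEvent): TabState {
  if (event.kind === "subscription") {
    // Only a subscription message changes the loading flag.
    return {
      isCreating: event.status === "IN_PROGRESS",
      pipelineIds: state.pipelineIds,
    };
  }
  // Polling refreshes the table but never touches the loading flag.
  return { isCreating: state.isCreating, pipelineIds: event.pipelineIds };
}

function replay(events: UiEvent[]): TabState {
  let state = initialState;
  for (const event of events) {
    state = reduceTab(state, event);
  }
  return state;
}
```

Replaying an IN_PROGRESS message followed by a poll result leaves isCreating set to true even though the pipeline row is present, which is the stuck state reported here. Adding a SUCCEEDED message to the sequence clears it.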

Environment

  • Reproduced on GitLab.com (gitlab-org/gitlab project)
  • Feature flag: ci_pipeline_creation_requests_realtime (introduced in !207190 (merged))
  • Only occurs on projects where pipeline creation takes longer (complex CI configurations)
  • Works correctly on smaller projects where pipeline creation is fast

Possible Investigation Areas

  1. WebSocket connection timeout: When pipeline creation is slow, the WebSocket connection may time out or be closed before the completion message is sent
  2. Redis pub/sub timing: The subscription trigger may be happening before the WebSocket subscription is fully established for the updated data
  3. ActionCable channel subscription: The channel may not be properly subscribed to receive updates after the initial message
  4. Load balancer/proxy timeout: Network infrastructure may be closing long-lived WebSocket connections
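
One way to separate areas 1 and 4 (connection closed) from areas 2 and 3 (connection open, message not delivered) is to check whether Action Cable ping frames kept arriving after the pipeline was created. Action Cable servers send ping frames periodically (every 3 seconds by default). The sketch below is hypothetical tooling, not GitLab code: TimedFrame, pingsAfter, and likelyConnectionDropped are invented names, and completedAtMs is assumed to be a timestamp the investigator records manually, for example when the pipeline row appears in the table.

```typescript
// Hypothetical diagnostic helper, not part of GitLab.
interface TimedFrame {
  receivedAtMs: number; // when the frame arrived, in epoch milliseconds
  data: string; // raw frame text
}

function isPing(frame: TimedFrame): boolean {
  try {
    const parsed = JSON.parse(frame.data);
    return typeof parsed === "object" && parsed !== null && parsed.type === "ping";
  } catch (_err) {
    return false; // non-JSON frames are never pings
  }
}

// Counts ping frames received at or after the given time.
function pingsAfter(frames: TimedFrame[], atMs: number): number {
  return frames.filter((f) => f.receivedAtMs >= atMs && isPing(f)).length;
}

// No pings after completion suggests, but does not prove, a closed connection.
function likelyConnectionDropped(frames: TimedFrame[], completedAtMs: number): boolean {
  return pingsAfter(frames, completedAtMs) === 0;
}
```

If pings continue after completedAtMs while no SUCCEEDED or FAILED message arrives, the timeout explanations become less likely and the trigger or channel subscription deserves a closer look.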

Related Issues

  • Feature flag MR: !207190 (merged)
  • Feature issue: #568346 (closed)
  • Rollout issue: #576639 (closed)