2025-06-10: Job stuck in pending state and not picked up by runners

Job stuck in pending state and not picked up by runners (Severity 1)

Problem: Jobs on GitLab.com are stuck in pending state because they are not being picked up by CI/CD runners.

Impact: Most jobs executing on GitLab.com Hosted Runners are not being processed.

Causes: Network connectivity issues between the runners and GitLab, caused by the HAProxy CI nodes being reported as unhealthy.

Response strategy: Attempted mitigations included updating the Chef config and requesting GCP assistance; a change was made that led to jobs being picked up again. We then found that rebooting the HAProxy nodes fixed the connectivity issues, and jobs were slowly picked up again.

Root cause: An unattended systemd package upgrade on the HAProxy CI nodes restarted the networkd service on those nodes, which led to the unexpected loss of critical configuration (network routing rules). Without those rules, the nodes could not respond successfully to GCP load balancer health checks.
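
One way to catch this class of failure earlier is to compare a node's current policy routing rules against a known-good baseline after any service restart. The following is a minimal sketch, not the tooling used in this incident: the function names and the baseline rule in the usage note are hypothetical, and it assumes the rules of interest are visible in the output of `ip rule show`.

```python
import subprocess


def parse_rules(ip_rule_output: str) -> set[str]:
    """Return rule bodies from `ip rule show` output, without the priority prefix."""
    rules = set()
    for line in ip_rule_output.splitlines():
        # Lines look like "32766:\tfrom all lookup main".
        _, _, body = line.strip().partition(":")
        body = " ".join(body.split())  # normalize whitespace
        if body:
            rules.add(body)
    return rules


def missing_rules(expected: set[str], ip_rule_output: str) -> set[str]:
    """Return the expected rules that are absent from the current output."""
    return expected - parse_rules(ip_rule_output)


def current_rules_output() -> str:
    """Read the live rules from the host (requires the iproute2 `ip` command)."""
    result = subprocess.run(
        ["ip", "rule", "show"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

For example, with a hypothetical baseline containing `from 10.0.0.0/8 lookup 100`, calling `missing_rules(baseline, current_rules_output())` on a node that has lost that rule would return a set containing it, which a monitoring check could alert on before the load balancer marks the node unhealthy.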


This ticket was created to track INC-1699, by incident.io 🔥