Shared Runner Managers are possibly underprovisioned
Shared runner managers need to be scaled up. On one host, for example, tasks spend 1.2s waiting for a CPU for every second that they run.
CPU is pinned near 100% for much of the day.
The 15-minute load average is around 2 per core.
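For anyone who wants to check the per-core figure on a host directly, here is a minimal sketch. It assumes a Unix host (os.getloadavg is not available on Windows), and the helper names load_per_core and current_load_per_core are made up for illustration.

```python
import os


def load_per_core(load_average: float, cores: int) -> float:
    """Divide a load average by the number of cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return load_average / cores


def current_load_per_core() -> float:
    """Return the 15-minute load average per core for the local host."""
    # os.getloadavg() returns the 1-, 5- and 15-minute load averages.
    _one, _five, fifteen = os.getloadavg()
    # os.cpu_count() can return None; fall back to 1 in that case.
    cores = os.cpu_count() or 1
    return load_per_core(fifteen, cores)


if __name__ == "__main__":
    print(f"15-minute load per core: {current_load_per_core():.2f}")
```

A value near 1 means the run queue roughly matches the core count. A sustained value near 2 suggests about twice as much demand as the host has CPUs for.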
Details
node_schedstat_waiting_seconds_total on shared-runners-manager-3.gitlab.com is up at 125%.
https://dashboards.gitlab.net/d/ci-runners-main/ci-runners-overview?orgId=1
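A rough sketch of how a waiting-per-running figure like the one above could be derived from two samples of the scheduler counters. It assumes the running time comes from node_schedstat_running_seconds_total (the node_exporter companion to the waiting counter), that the values are already summed across CPUs, and that no counter reset happened between samples. SchedstatSample and wait_ratio are hypothetical names, and the numbers in the usage lines are made up, not taken from the dashboard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SchedstatSample:
    """One reading of the two counters, summed across all CPUs."""
    waiting_seconds: float  # node_schedstat_waiting_seconds_total
    running_seconds: float  # node_schedstat_running_seconds_total (assumed)


def wait_ratio(earlier: SchedstatSample, later: SchedstatSample) -> float:
    """Seconds spent waiting for a CPU per second spent running."""
    waited = later.waiting_seconds - earlier.waiting_seconds
    ran = later.running_seconds - earlier.running_seconds
    if waited < 0 or ran <= 0:
        # Negative deltas point to a counter reset; zero run time
        # makes the ratio undefined.
        raise ValueError("samples do not describe a usable interval")
    return waited / ran


# Illustrative values only.
earlier = SchedstatSample(waiting_seconds=100.0, running_seconds=200.0)
later = SchedstatSample(waiting_seconds=160.0, running_seconds=250.0)
print(f"{wait_ratio(earlier, later):.2f}s waiting per second running")
```

A ratio above 1 means tasks spend more time queued for a CPU than executing, which is the symptom described for this host.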