For follow-up, fix pending job metric
- What we found: The “pending job queue size” metric is capped because the query that retrieves jobs from the queue now limits its output, so the metric is only an estimate and does not reflect the true backlog during high load.
- Why it matters: This makes it harder to accurately track runner health and pipeline execution health during incidents, because dashboards that rely on the Rails-sourced metric are effectively blind when the cap is hit.
- Likely source change: Tomasz traced the behavior to MR [!225079](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/225079) (introduced to address slow DB queries with high pending builds).
- Next steps: Identify an uncapped/alternative signal for backlog (or adjust the metric/query) so incident responders can reliably assess pending CI work; follow up with the owning team since the MR author is OOO.
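
As a stopgap while an uncapped signal is being identified, dashboards could at least flag when the reading has hit the cap. The sketch below is illustrative only: `ASSUMED_QUERY_LIMIT`, `BacklogReading`, `read_pending_jobs`, and `describe` are hypothetical names, and the limit value is a placeholder, since the actual limit applied by the queue query is not stated in this issue.

```python
from dataclasses import dataclass

# Hypothetical cap: the real limit used by the queue query is not given in this issue.
ASSUMED_QUERY_LIMIT = 1000


@dataclass
class BacklogReading:
    value: int       # what the capped metric reports
    saturated: bool  # True when the reading reached the cap


def read_pending_jobs(true_backlog: int, limit: int = ASSUMED_QUERY_LIMIT) -> BacklogReading:
    """Simulate a metric derived from a query whose output is limited."""
    observed = min(true_backlog, limit)
    # A reading at the cap only tells us the backlog is at least `limit`.
    return BacklogReading(value=observed, saturated=observed >= limit)


def describe(reading: BacklogReading) -> str:
    """Render the reading so responders can tell a real value from a capped one."""
    if reading.saturated:
        return f">= {reading.value} (capped, true backlog unknown)"
    return str(reading.value)


if __name__ == "__main__":
    for backlog in (250, 50_000):
        print(describe(read_pending_jobs(backlog)))
```

A reading exactly at the cap is treated as saturated, which is deliberately conservative: it may occasionally flag a true value equal to the limit, but it never presents a capped value as accurate.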
_This ticket was created from_ [_INC-8376_](https://app.incident.io/gitlab/incidents/8376) _using_ [_incident.io_](https://app.incident.io) 🔥