Record metrics for JobWaiter timeouts

Sean McGivern requested to merge add-metrics-for-jobwaiter-timeouts into master

What does this MR do?

When a JobWaiter times out, the caller cannot tell whether all of the jobs have completed. This is particularly important for the AuthorizedProjectsWorker, which various controllers call with a timeout of 10 seconds.

To improve that worker, we first want to track how often these waiters time out compared to how often we attempt to wait. Based on the results, we can consider different approaches to making the worker finish within its timeout more often.
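
As a rough illustration of the idea (not the code in this MR), the sketch below counts every wait attempt and every wait that hits its timeout. `SketchCounter` and `SketchWaiter` are hypothetical names invented for this example, and the in-memory counter only stands in for whatever metrics client the application actually uses.

```ruby
require 'timeout'

# Hypothetical in-memory counter; a stand-in for a real metrics client.
class SketchCounter
  attr_reader :value

  def initialize
    @value = 0
    @mutex = Mutex.new
  end

  def increment
    @mutex.synchronize { @value += 1 }
  end
end

# Hypothetical waiter: blocks until `expected` jobs have signalled
# completion, or until the timeout expires, whichever comes first.
class SketchWaiter
  STARTED = SketchCounter.new    # incremented on every wait attempt
  TIMED_OUT = SketchCounter.new  # incremented only when a wait times out

  def initialize(expected)
    @expected = expected
    @queue = Queue.new
  end

  # Called by a job when it finishes.
  def signal(job_id)
    @queue << job_id
  end

  # Returns the IDs of the jobs that finished before the timeout (in seconds).
  def wait(timeout)
    STARTED.increment
    finished = []

    begin
      Timeout.timeout(timeout) do
        finished << @queue.pop while finished.size < @expected
      end
    rescue Timeout::Error
      # The caller cannot tell whether the remaining jobs completed.
      TIMED_OUT.increment
    end

    finished
  end
end
```

Under these assumptions, dividing `SketchWaiter::TIMED_OUT.value` by `SketchWaiter::STARTED.value` gives the fraction of waits that timed out, which is the comparison described above.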

For gitlab-com/gl-infra/scalability#166 (closed).
