
Fail jobs that fail to render registration response

Stan Hu requested to merge sh-requeue-failed-job-register into master

When a runner registers for a build, rendering the JSON payload may require a few Gitaly RPCs, any of which may fail. If any of these fail, the build will appear as though it has been assigned to the runner, but the runner will have received a 500 error and will not proceed.

To recover from situations like this, we try to render the payload in RegisterJobService. If that fails, we mark the job as failed with a scheduler_failure so that it may be retried.
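
For reviewers, the recovery path can be illustrated with a minimal, self-contained sketch. This is not the code in this merge request: `Job`, `Result`, `RegisterJobSketch`, `assign!`, `drop!`, and `render_payload` are hypothetical stand-ins chosen for illustration, and only the `scheduler_failure` reason comes from the description above.

```ruby
# Hypothetical stand-in for a CI job; not GitLab's actual model.
Job = Struct.new(:id, :status, :failure_reason, :payload_renderer) do
  # Mark the job as picked up by a runner.
  def assign!
    self.status = :running
  end

  # Mark the job as failed with the given reason so it can be retried.
  def drop!(reason)
    self.status = :failed
    self.failure_reason = reason
  end

  # Rendering may call out to a backend (Gitaly, in this MR) and raise.
  def render_payload
    payload_renderer.call(self)
  end
end

# Hypothetical result object returned to the API layer.
Result = Struct.new(:job, :payload, :valid)

class RegisterJobSketch
  # Assign the job, then render the payload before responding.
  # If rendering raises, fail the job instead of leaving it
  # assigned to a runner that never received it.
  def execute(job)
    job.assign!
    payload = job.render_payload
    Result.new(job, payload, true)
  rescue StandardError
    job.drop!(:scheduler_failure)
    Result.new(nil, nil, false)
  end
end
```

The point of rendering inside the service is that a failure is caught while the job state can still be corrected, rather than surfacing later as a 500 in the response.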

This came out of #227215 (closed).

Edited by Stan Hu
