Detection to fail fast on connection loss
Description
A customer had an incident where the ingress infrastructure sitting between the runners and GitLab was unexpectedly restarted. The upload/download to the coordinator hung for runners that had open connections. This resulted in the jobs timing out after their configured timeout (an hour in this case).
Proposal
Add protection or detection on the runners that we could enable customers to fail fast on connection loss
Links to related issues and merge requests / references
Slack discussion: https://gitlab.slack.com/archives/CBQ76ND6W/p1617805195398400 Customer interest: https://gitlab.my.salesforce.com/00161000004zrG3