Pending CI jobs time out earlier than the configured timeout
Summary
Pending CI jobs time out earlier than the configured timeout.
Steps to reproduce
Using GitLab Community Edition 11.7.5
We have a weekly IT maintenance activity that takes our intranet offline for a couple of hours. The ritual at this point is:
Two hours before maintenance, all GitLab runners are paused in the UI, which leaves plenty of time for any running jobs to finish. IT does its work and then unpauses all the runners once the network is back online.
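For anyone who wants to reproduce the pause/unpause step without clicking through the UI, it could be scripted roughly as below. This is only a sketch: it assumes the runners REST API (`GET /runners/all` and `PUT /runners/:id` with the `active` attribute) and an admin personal access token. `GITLAB_URL`, `API_TOKEN`, and the function names are hypothetical placeholders, not part of our actual setup.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical values: replace with your own instance URL and an admin token.
GITLAB_URL = "https://gitlab.example.com"
API_TOKEN = "REPLACE_ME"


def build_runner_update(base_url, token, runner_id, active):
    """Build a PUT /runners/:id request that sets the runner's `active` flag."""
    body = urllib.parse.urlencode({"active": "true" if active else "false"}).encode()
    return urllib.request.Request(
        f"{base_url}/api/v4/runners/{runner_id}",
        data=body,
        method="PUT",
        headers={"PRIVATE-TOKEN": token},
    )


def list_runner_ids(base_url, token):
    """Return runner IDs from the first page of GET /runners/all (admin only)."""
    req = urllib.request.Request(
        f"{base_url}/api/v4/runners/all?per_page=100",
        headers={"PRIVATE-TOKEN": token},
    )
    with urllib.request.urlopen(req) as resp:
        return [runner["id"] for runner in json.load(resp)]


def set_all_runners_active(base_url, token, active):
    """Pause (active=False) or unpause (active=True) every listed runner."""
    for runner_id in list_runner_ids(base_url, token):
        request = build_runner_update(base_url, token, runner_id, active)
        with urllib.request.urlopen(request):
            pass  # Response body is not needed; an HTTP error raises.
```

Calling `set_all_runners_active(GITLAB_URL, API_TOKEN, False)` before maintenance and again with `True` afterwards would mirror what is currently done by hand.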
What we have observed on multiple occasions is that all pending jobs that haven't started time out at approximately the time the network goes down:
There has been a timeout failure or the job got stuck. Check your timeout limits or try again
The timeout value on the repo's CI/CD settings page (under "General pipelines") is set to 7 days for good measure. None of the pending jobs is even a day old, so they should not be timing out.
The developers who own the pending jobs (myself included) get pretty grumpy about all their pipelines failing. Is there any way to make GitLab respect the timeout setting? Based on the timing, I believe it instantly times out all the pending jobs when the connection to the runner is lost. I can see that the GitLab server itself and all its processes remained online throughout the maintenance period.
What is the current bug behavior?
Pending GitLab runner jobs time out the instant the connection to the runner is lost.
What is the expected correct behavior?
GitLab runner jobs remain in pending state for the timeout period specified in the CI/CD settings page.
Relevant logs and/or screenshots
There has been a timeout failure or the job got stuck. Check your timeout limits or try again
I would be interested in workarounds if you can think of any.
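One stopgap that might work, shown here as an untested sketch rather than a fix: after maintenance, retry every job that failed without ever starting during the outage. It assumes the jobs REST API (`POST /projects/:id/jobs/:job_id/retry`) and that such jobs are reported with `status` set to `failed`, a null `started_at`, and a `finished_at` inside the maintenance window. The function names and the window bounds are hypothetical.

```python
import json
import urllib.request
from datetime import datetime, timezone


def parse_ts(value):
    """Parse an ISO 8601 timestamp such as '2019-02-10T02:05:00.000Z' as UTC-aware."""
    ts = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=timezone.utc)
    return ts


def jobs_to_retry(jobs, window_start, window_end):
    """Return IDs of failed jobs that never started and finished inside the window."""
    selected = []
    for job in jobs:
        if job.get("status") != "failed" or job.get("started_at") is not None:
            continue
        finished = job.get("finished_at")
        if finished and window_start <= parse_ts(finished) <= window_end:
            selected.append(job["id"])
    return selected


def retry_job(base_url, token, project_id, job_id):
    """Ask GitLab to retry one job and return the decoded JSON response."""
    req = urllib.request.Request(
        f"{base_url}/api/v4/projects/{project_id}/jobs/{job_id}/retry",
        method="POST",
        headers={"PRIVATE-TOKEN": token},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The `jobs` list would come from `GET /projects/:id/jobs` for each affected project. This only papers over the failures; it does not make pending jobs survive the outage.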