Fix Environment destroy job is retried endlessly

Hunter Stewart requested to merge hustewart-stop-recover-jobs into master

Why

In CI: Environment destroy job is retried endlessl... (#433264 - closed), it's reported that some jobs are being retried endlessly. I'm still researching, but it's possible this is related to recent changes in the Environments code that try to automatically recover jobs that are stuck stopping.

What

This is a quickly drafted MR to try to identify the places where we enqueue new jobs.

One place is triggered by a state transition; the other is a cron job that recovers jobs that have been stuck stopping for a set amount of time.
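To make the two enqueue points easier to reason about, here is a toy model in Python. It is not the GitLab code: every name (`Environment`, `stop`, `recover_stuck`, `STUCK_AFTER_SECONDS`) and the one-hour threshold are hypothetical. It only illustrates one way the reported behavior could arise, assuming the stop job never moves the environment out of the stopping state.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical threshold: how long an environment may sit in "stopping"
# before the periodic sweep treats it as stuck.
STUCK_AFTER_SECONDS = 3600


@dataclass
class Environment:
    name: str
    state: str = "available"
    stopping_since: Optional[float] = None
    # Records (source, timestamp) for every job enqueued for this environment.
    enqueued: List[Tuple[str, float]] = field(default_factory=list)

    def stop(self, now: float) -> None:
        # Enqueue point 1: the transition into "stopping" enqueues a job.
        self.state = "stopping"
        self.stopping_since = now
        self.enqueued.append(("transition", now))


def recover_stuck(envs: List[Environment], now: float) -> None:
    # Enqueue point 2: a periodic sweep re-enqueues a job for every
    # environment that has been "stopping" for at least the threshold.
    for env in envs:
        if env.state != "stopping" or env.stopping_since is None:
            continue
        if now - env.stopping_since >= STUCK_AFTER_SECONDS:
            env.enqueued.append(("cron", now))


# If nothing ever moves the environment out of "stopping",
# each sweep adds another job.
env = Environment("review/example")
env.stop(now=0)
for run in range(1, 4):
    recover_stuck([env], now=run * STUCK_AFTER_SECONDS)
print(len(env.enqueued))  # prints 4
```

In this sketch nothing bounds the number of recovery attempts, so the count grows with every sweep. Whether the real code behaves this way is exactly what still needs to be confirmed.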

Next steps

We still need to identify the exact problem reported in the issue and determine a resolution. The cause could be something else entirely; this is just the code I was aware of and thought of when I saw the issue, so we need to learn more about the cause first.

If this code is the problem, we may want to use this MR as a starting point to stop enqueuing the jobs.

Edited by Hunter Stewart
