Allow pipelines to schedule delayed job runs
Problem to Solve
Letting a pipeline step schedule a job to run after a delay allows more complex behaviors to be implemented in the CI YAML. There are several reasons to do this; typical ones are:
- Incremental rollout scenarios
- Waiting for another task (with a fixed time delay) to complete
By allowing these delays to be implemented directly, you avoid situations such as keeping a runner occupied running a sleep command for the length of the delay.
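For illustration, the sleep-based workaround mentioned above might look like the following minimal sketch. The job name, the 30-minute duration, and the deploy script are hypothetical and only serve to show why the runner stays busy for the whole wait.

```yaml
# Workaround that this proposal aims to make unnecessary.
# Job name, duration, and script are hypothetical.
deploy_after_wait:
  script:
    - sleep 1800     # the runner is occupied for 30 minutes doing nothing
    - ./deploy.sh    # hypothetical deploy script
```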
Solution
We will add a when: delayed option to the CI configuration, with the length of the delay given by start_in:, that allows a job run to be delayed. The delay starts the moment the job would otherwise be invoked, not from the start of the pipeline (unless the job starts at the same time as the pipeline).
sample_job:
  when: delayed
  start_in: 30 minutes
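To show how the delay relates to the point at which the job would otherwise be invoked, here is a slightly fuller sketch. The stage names, job names, and scripts are assumptions made for illustration; only when: delayed and start_in: come from this proposal.

```yaml
stages:
  - build
  - deploy

build_app:
  stage: build
  script:
    - echo "building"

deploy_app:
  stage: deploy
  when: delayed
  # Counted from the moment the pipeline reaches the deploy stage,
  # not from the start of the pipeline.
  start_in: 30 minutes
  script:
    - echo "deploying"
```

In this sketch the countdown for deploy_app would begin only once build_app has passed and the deploy stage is reached.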
Behavior on Cancel
- In order to cancel the complete rollout, the user can use "cancel pipeline" as normal and all jobs will be canceled
- If an individual job is canceled, other jobs continue running and the canceled job essentially becomes a manual task. This behavior may eventually change (by becoming more intelligent and understanding predecessor relationships between jobs) via #47063 (closed). Users who feel strongly that the pipeline should pause when a job is canceled (for example, incremental rollouts that should not proceed if one of the jobs is canceled) can implement this by putting each job in a separate stage.
- Retry always fires the job immediately, without re-running the delay.
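The "separate stage per job" approach described above could be sketched as follows for an incremental rollout. The stage names, job names, delays, and the rollout script are hypothetical; the point is only that each delayed job sits in its own stage, so a later step is not reached until the earlier one has passed.

```yaml
stages:
  - rollout_10
  - rollout_50
  - rollout_100

rollout_10_percent:
  stage: rollout_10
  when: delayed
  start_in: 5 minutes
  script:
    - ./rollout.sh 10     # hypothetical rollout script

rollout_50_percent:
  stage: rollout_50
  when: delayed
  start_in: 5 minutes     # counted from when this stage is reached
  script:
    - ./rollout.sh 50

rollout_100_percent:
  stage: rollout_100
  when: delayed
  start_in: 5 minutes
  script:
    - ./rollout.sh 100
```

If all three jobs shared one stage instead, canceling one of them would leave the others counting down independently, which is the behavior the bullet above warns about.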
Pipeline scheduled jobs concept
graph LR
PS>Pipeline starts]
PASS(Pipeline arrives at stage <br> that includes scheduled jobs)
PA(Pipeline aborts)
SCD(Starts counting down delay)
SSB(Jobs script begins running)
PS --> PASS
PS -->|Pipeline is cancelled or fails| PA
SJ(Scheduled job)
NJ(Normal job)
MJ(Manual job)
PASS --> SJ
SJ --> SCD
PASS --> NJ
NJ --> SSB
PASS --> MJ
MJ -->|Manual action triggered| SSB
SCD -->|Countdown finishes| SSB
SCD -->|Unscheduled| MJ
PP(Pipeline successfully finishes)
PA(Pipeline aborts)
SSB -->|All jobs pass| NS
SSB -->|A job fails| PA
SSB -->|A job is cancelled| PA
NS{Next stage?}
NS -->|Yes| PASS
NS -->|No| PP
Mockups
Pipeline lists & Environments list:
Pipeline detail page:
Jobs lists:
Job detail page:
Schedule job empty state
This is a scheduled job to run in
00:00:00
This job will automatically run after its timer finishes. Scheduled jobs are often used for incremental roll-out deploys to production environments. When unscheduled, it converts into a manual action.
Button (default grey) Unschedule job
Can we also change the copy of the existing buttons to:
Trigger manual action
Cancel job
And the following illustration, which is included in this merge request gitlab-svgs!129 (merged)
Assets
./node_modules/@gitlab-org/gitlab-svgs/sprite_icons/status_manual.svg
./node_modules/@gitlab-org/gitlab-svgs/sprite_icons/status_manual_borderless.svg
./app/assets/images/ci_favicons/favicon_status_manual.png
./app/assets/images/ci_favicons/canary/favicon_status_manual.ico
./app/views/shared/icons/_icon_status_manual_borderless.svg
./app/views/shared/icons/_icon_status_manual.svg
gitlab-svgs!128 (merged) adds appropriate icons to http://gitlab-org.gitlab.io/gitlab-svgs/
- Unschedule action button uses a time-out icon
- Scheduled job status uses status_scheduled_borderless if using a borderless icon (_icon_status_scheduled_borderless.svg)
follow up on immediate action to trigger scheduled job => running
Additional edits
Pipeline list
A scheduled action will already be inside the manual action dropdowns even if it is still counting down. Clicking will immediately trigger the scheduled job. This will open up a modal asking if the user is sure.
Run JOBNAME?
Are you sure you want to run JOBNAME immediately? This job will run automatically after its timer finishes.
Button (default grey) Cancel
Button (default grey) Run job immediately
Note: A decision was made to start with a browser-native modal, as it reduces scope.
Job detail page:
Schedule job empty state
This is a scheduled job to run in
00:00:00
This job will run automatically after its timer finishes. Scheduled jobs are often used for incremental roll-out deploys to production environments. When unscheduled, it converts into a manual action.
Button (default grey) Unschedule job
Button (default grey) Run job immediately