Limit pipeline job concurrency by resource_group
Problem to Solve
Some pipelines and/or jobs use unique resources or are in some way destructive to an environment. Being able to limit concurrency for them would give users control over scenarios where there should only be one deploy at a time for a given:
- Environment
- Entire Project
- Job (perhaps due to shared testing infrastructure in a testing lab)
Solution
Implementing a generic semaphore for pipeline jobs would be our way to go. We will use resource groups: each resource group is essentially a slot (currently limited to one). When multiple jobs are scheduled to run, the first job locks this "slot" and the rest of the jobs need to wait for the lock to be released. Resource groups will be managed by the GitLab server, so no development is needed from the runner team. The user will need to set the runner's concurrency to 1 for the lock to be complete.
There can be multiple resource groups per project, but each one runs with a concurrency of 1.
A good example is physical devices: each device would be a resource group, and only one job can run on a given device at any time.
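The slot behaviour described above can be sketched as a toy model. This is only an illustration, not the GitLab server implementation: the `ResourceGroups` class and its `request` and `release` methods are hypothetical names, and the first-in-first-out hand-off is an assumption (this iteration makes no ordering guarantee).

```python
from collections import deque


class ResourceGroups:
    """Toy model of server-side resource groups: one slot per group."""

    def __init__(self):
        self._holder = {}   # group name -> job currently holding the slot
        self._waiting = {}  # group name -> queue of jobs waiting for the slot

    def request(self, group, job):
        """Return True if `job` gets the slot now, False if it must wait."""
        if group not in self._holder:
            self._holder[group] = job
            return True
        self._waiting.setdefault(group, deque()).append(job)
        return False

    def release(self, group, job):
        """Free the slot held by `job`; return the next job to run, if any."""
        if self._holder.get(group) != job:
            raise ValueError(f"{job!r} does not hold {group!r}")
        queue = self._waiting.get(group)
        if queue:
            # Assumption: hand the slot to the longest-waiting job (FIFO).
            next_job = queue.popleft()
            self._holder[group] = next_job
            return next_job
        del self._holder[group]
        return None
```

With one group per physical device, a second job for `device-1` waits while a job for `device-2` starts immediately, because each group has its own slot.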
Sample Configuration
This example moves the lock to a job. Multiple pipelines can run simultaneously, but jobA will only ever run one at a time, across all pipelines in the project.
stages:
  - build

jobA:
  resource_group: jobA
  stage: build
  script:
    - echo HelloA

jobB:
  stage: build
  script:
    - echo HelloB
There are some useful patterns for resource groups:
- resource_group: $CI_ENVIRONMENT_NAME ... Limit per environment
- resource_group: $CI_JOB_NAME ... Limit per job
- resource_group: $CI_COMMIT_REF_NAME:$CI_JOB_NAME ... Limit per job per branch
- resource_group: $CI_COMMIT_REF_NAME:$CI_ENVIRONMENT_NAME ... Limit per environment per branch (e.g. review apps)
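To see why these patterns give different lock scopes, it can help to look at the key each one expands to. The sketch below uses Python's `string.Template` as a stand-in for CI variable expansion; the helper name `resource_group_key` and the sample variable values are made up for illustration and do not reflect how GitLab expands variables internally.

```python
from string import Template


def resource_group_key(pattern, variables):
    """Expand $VARIABLE references in a resource_group pattern."""
    return Template(pattern).substitute(variables)


# Hypothetical values for a deploy job running on a feature branch.
variables = {
    "CI_ENVIRONMENT_NAME": "production",
    "CI_JOB_NAME": "deploy",
    "CI_COMMIT_REF_NAME": "feature-x",
}

per_environment = resource_group_key("$CI_ENVIRONMENT_NAME", variables)
per_job_per_branch = resource_group_key(
    "$CI_COMMIT_REF_NAME:$CI_JOB_NAME", variables
)
```

Jobs that expand to the same key share one slot, so the per-branch patterns only serialize jobs within a single branch.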
Proposal
Implementing a generic semaphore for pipeline jobs would be our way to go. We will use resource groups to define the lock. In addition, the user needs to set the runner's concurrency to 1 for the solution to be complete.
Only one job can run on a resource group at any given time. Other jobs must wait for the resource group to be unlocked before running. The entire logic will be managed by the GitLab server; runners will not need to change.
The concurrency is fixed at 1 for this iteration and cannot be configured. Making it configurable is left for a later iteration.
What will not be included in this iteration:
- Limit forward deployments: we will not check the sequence of the pipelines, so job b may run before job a even if job b depends on job a. This will be handled in #25276 (closed)
UX Proposal
Proposed changes
- When a job is waiting for a resource group, display an icon indicating this status wherever pipeline graphs are shown. The list can be seen here.
- Hovering this icon should display a tooltip showing the following information:
- Job name
- Job status as waiting for resource
Future Improvements
Implicit locking for environments
Because environments are, more often than not, the kind of place where you'd want only one deployment to run at once, and always in the correct order, we will include implicit locking wherever environment: is used, using a semaphore with the name of the environment.
- When environment: is used, it implies Resource Group:, so you don't need to specify both Resource Group: and environment:,
- When environment: is used, you can use Resource Group: some-name to create a lock across all environment deployments,
- When the implicit lock is used, you can define Resource Group: nil to disable locking and run with the full concurrency limit,
- The implicit lock for the environment comes from the assumption that deployments, by design, do not work well when executed concurrently
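The rules above amount to a small precedence order, which might be sketched as follows. This is an illustration under stated assumptions: jobs are modelled as plain dictionaries, the key is spelled `resource_group`, `environment` is a simple string, and `None` stands in for `nil`. The function name `effective_resource_group` is hypothetical.

```python
def effective_resource_group(job):
    """Return the lock name a job would use, or None for no locking."""
    if "resource_group" in job:
        # An explicit setting wins, including an explicit None ("nil"),
        # which disables the implicit environment lock.
        return job["resource_group"]
    # Otherwise fall back to the environment name, if the job has one.
    return job.get("environment")
```

A job with only `environment: production` would lock on `production`, while adding an explicit group name or `nil` overrides that default.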
Different concurrency behaviors
At the moment, all this will do is wait for a semaphore to free up. You could imagine more possibilities:
concurrency:
  parallel: Default (current behavior), the job is launched even if another is in progress
  cancel: Cancel the job if it is launched in parallel with another
  wait: Wait for the previous job to finish before launching the current one
  skip: Skip the job if the lock is already acquired
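One way to read these four modes is as a decision made when a job is scheduled. The sketch below is a guess at the intended semantics, not a specification: it assumes `cancel` and `skip` both apply to the newly scheduled job (differing only in the status it ends up with), and the function name `on_schedule` is hypothetical.

```python
MODES = ("parallel", "cancel", "wait", "skip")


def on_schedule(mode, lock_held):
    """Return what happens to a newly scheduled job: run, cancel, wait or skip."""
    if mode not in MODES:
        raise ValueError(f"unknown concurrency mode: {mode!r}")
    if mode == "parallel" or not lock_held:
        # Either locking is not enforced, or the slot is free.
        return "run"
    # The slot is taken: the mode name doubles as the resulting action
    # (assumed to apply to the new job, not the one already running).
    return mode
```

Under this reading, the behavior proposed for the current iteration corresponds to `wait`.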
Pipeline-level lock
This example will only ever run one of the project's pipelines at once. The pipeline itself will run as normal, with all jobs running in parallel in the build stage.
Resource Group: $CI_PROJECT_NAME
# Resource Group: $CI_ENVIRONMENT_NAME for example would give you a way to run one entire pipeline per environment

stages:
  - build

jobA:
  stage: build
  script:
    - echo HelloA

jobB:
  stage: build
  script:
    - echo HelloB
Future UX Considerations
In addition to seeing a job is waiting, a user may also want:
- Resource group it is waiting for
- Current job running in the resource group
- An indicator on the job that is currently using the resource
- Position of the job in the queue
- Linking between the jobs to allow a user to navigate to them
These may be accomplished with additional icons, and changes to the tooltip and/or adding this information to the job detail section.
Links
- https://buildkite.com/docs/pipelines/controlling-concurrency
- https://jenkins.io/blog/2016/10/16/stage-lock-milestone/
Technical proposal
TBD
Feature Flag
This feature is implemented behind the ci_resource_group feature flag and is disabled by default.
Once we've confirmed the feature is stable, we will remove the feature flag to publish the feature as GA.