  • #15536

Limit pipeline job concurrency by resource_group

Problem to Solve

Some pipelines and/or jobs use unique resources or are in some way destructive to an environment. Limiting their concurrency would give users control over scenarios where only one deployment should run at a time per:

  • Environment
  • Entire Project
  • Job (perhaps due to shared testing infrastructure in a testing lab)

Solution

We propose implementing a generic semaphore for pipeline jobs, using resource groups. Each resource group is essentially a slot (currently limited to one). When multiple jobs are scheduled to run, the first job locks this slot and the remaining jobs must wait for the lock to be released. Resource groups are managed by the GitLab server, so no development is needed from the runner team. The user must set the runner's concurrency to 1 for the lock to be complete. A project can have multiple resource groups, but each one is limited to a concurrency of 1. Physical devices are a good example: each device would be a resource group, and only one job can run on a given device at any time.
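As a minimal sketch of the device scenario, the configuration below assumes two hypothetical devices, device-1 and device-2. The job names, stage, and scripts are illustrative only and do not come from the proposal. Jobs that share a resource_group value would wait for each other, while jobs in different groups could run at the same time.

```yaml
stages:
  - test

# Hypothetical jobs: both target the same physical device, so they share one lock.
smoke_test_device_1:
  stage: test
  resource_group: device-1
  script:
    - echo "Smoke test on device 1"

regression_test_device_1:
  stage: test
  resource_group: device-1   # waits while another device-1 job holds the lock
  script:
    - echo "Regression test on device 1"

# A different device is a different resource group,
# so this job does not wait for the two jobs above.
smoke_test_device_2:
  stage: test
  resource_group: device-2
  script:
    - echo "Smoke test on device 2"
```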

Sample Configuration

This example moves the lock to a job. Multiple pipelines can run simultaneously, but jobA will only ever run one at a time, across all pipelines in the project.

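stages:
  - build

# jobA holds the "jobA" resource group while it runs, so only one
# instance of jobA runs at a time across all pipelines in the project.
jobA:
  resource_group: jobA
  stage: build
  script:
    - echo HelloA

# jobB has no resource_group, so it is not limited by the lock.
jobB:
  stage: build
  script:
    - echo HelloB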

Some useful patterns for resource_group:

  • resource_group: $CI_ENVIRONMENT_NAME ... Limit per environment
  • resource_group: $CI_JOB_NAME ... Limit per job
  • resource_group: $CI_COMMIT_REF_NAME:$CI_JOB_NAME ... Limit per job per branch
  • resource_group: $CI_COMMIT_REF_NAME:$CI_ENVIRONMENT_NAME ... Limit per environment per branch (e.g. review apps)
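The per-environment patterns above might look like the following in a full job definition. This is a sketch only: the job names, environment names, and scripts are hypothetical, and it assumes that variables in resource_group are expanded as the patterns suggest.

```yaml
stages:
  - deploy

# Hypothetical production deployment: one lock per environment.
deploy_production:
  stage: deploy
  environment:
    name: production
  resource_group: $CI_ENVIRONMENT_NAME   # assumed to resolve to "production"
  script:
    - echo "Deploying to production"

# Hypothetical review app: one lock per environment per branch.
deploy_review:
  stage: deploy
  environment:
    name: review/$CI_COMMIT_REF_NAME
  resource_group: "$CI_COMMIT_REF_NAME:$CI_ENVIRONMENT_NAME"
  script:
    - echo "Deploying review app"
```

With this layout, two pipelines deploying to production would queue behind each other, while review apps on different branches would not block one another.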

Proposal

We propose implementing a generic semaphore for pipeline jobs, using resource groups to define the lock. In addition, the user needs to set the runner configuration to concurrency 1 for the solution to be complete. Only one job can run in a resource group at any given time; other jobs must wait for the resource group to be unlocked before running. The entire logic will be managed by the GitLab server, so runners will not need to change.

For this iteration, the concurrency is fixed at 1 and cannot be configured. Making it configurable is left for a later iteration.

What will not be included in this iteration:

Limiting forward deployments: we will not check the sequence of the pipelines, so job B may run before job A even if job B depends on job A. This will be handled in #25276 (closed).

UX Proposal

Proposed changes

  • When a job is waiting for a resource group, display an icon indicating this status wherever pipeline graphs are shown. The list can be seen here.
  • Hovering this icon should display a tooltip showing the following information:
    • Job name
    • Job status as waiting for resource

MVC

Future Improvements

Implicit locking for environments

Because environments are much more often than not the kind of place where you'd want only one deployment to run at once, and always in the correct order, we will include implicit locking wherever environment: is used, using a semaphore with the name of the environment.

  1. When environment: is used, it implies resource_group:, so you don't need to specify both resource_group: and environment:.
  2. When environment: is used, you can use resource_group: some-name to create a lock across all environment deployments.
  3. When the implicit lock is used, you can define resource_group: nil to disable locking and run with full concurrency.
  4. The implicit lock for environments comes from the assumption that deployments, by design, do not work well when executed concurrently.
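To illustrate the first point, the explicit configuration below is roughly what the implicit lock would stand in for. It is a sketch that uses the resource_group keyword from the sample configuration; the job name, stage, and environment name are hypothetical, and the implicit behavior itself is only a proposal at this point.

```yaml
stages:
  - deploy

# Explicit form, as it would be written in this iteration.
deploy_staging:
  stage: deploy
  environment: staging
  resource_group: staging   # under the proposal, `environment:` alone would imply this line
  script:
    - echo "Deploying to staging"
```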

Different concurrency behaviors

At the moment, all this will do is wait for a semaphore to free up. You could imagine more possibilities:

concurrency:
   parallel: Current default; the job is launched even if another one is in progress
   cancel: Cancel the job if it is launched in parallel with another
   wait: Wait for the previous job to finish before launching the current one
   skip: Skip the job if the lock is already acquired

Pipeline-level lock

This example runs only one of the project's pipelines at a time. The pipeline itself will run as normal, with all jobs running in parallel in the build stage.

resource_group: $CI_PROJECT_NAME
# resource_group: $CI_ENVIRONMENT_NAME, for example, would give you a way to run one entire pipeline per environment

stages:
  - build

jobA:
  stage: build
  script: 
    - echo HelloA

jobB: 
  stage: build
  script:
    - echo HelloB

Future UX Considerations

In addition to seeing a job is waiting, a user may also want:

  • Resource group it is waiting for
  • Current job running in the resource group
  • An indicator on the job that is currently using the resource
  • Position of the job in the queue
  • Linking between the jobs to allow a user to navigate to them

These may be accomplished with additional icons, and changes to the tooltip and/or adding this information to the job detail section.

Links

  • https://buildkite.com/docs/pipelines/controlling-concurrency
  • https://jenkins.io/blog/2016/10/16/stage-lock-milestone/

Technical proposal

TBD

Feature Flag

This feature is implemented behind the ci_resource_group feature flag, which is disabled by default. Once we've confirmed the feature is stable, we will remove the feature flag to publish the feature as GA.

Planned MRs

Backend

  • PoC !20450 (closed)
  • CI Resource Group models and parser
  • CI Resource Groups Status Transition

General

  • Write a feature spec to test the frontend and backend changes together
  • Remove the feature flag and update documentation # i.e. publish the feature
Edited Jan 06, 2020 by Orit Golowinski