Make it possible to define a cartesian product/matrix for build jobs

Problem to Solve

We recently introduced child/parent pipelines (#16094 (closed)), which allow you to write your own code to generate a child/parent pipeline. This is a powerful way to generate any custom behavior, but it is heavyweight for simpler scenarios where you just want to create jobs for a simple matrix.

Proposal

We will add a matrix keyword to parallel, which is a pre-existing keyword that handles parallelization of jobs:

deploystacks:
  stage: deploy
  parallel:
    matrix:
      - PROVIDER: aws
        STACK: [monitoring, app1, app2]
      - PROVIDER: ovh
        STACK: [monitoring, backup, app]
      - PROVIDER: gcp
        STACK: [data, processing]

The above would generate the following parallel jobs:

  1. deploystacks (PROVIDER=aws; STACK=monitoring)
  2. deploystacks (PROVIDER=aws; STACK=app1)
  3. deploystacks (PROVIDER=aws; STACK=app2)
  4. deploystacks (PROVIDER=ovh; STACK=monitoring)
  5. deploystacks (PROVIDER=ovh; STACK=backup)
  6. deploystacks (PROVIDER=ovh; STACK=app)
  7. deploystacks (PROVIDER=gcp; STACK=data)
  8. deploystacks (PROVIDER=gcp; STACK=processing)
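
The expansion above can be sketched in Python as follows. This is only an illustration of the intended semantics, not GitLab's implementation: the helper name `expand_matrix` and the job-name format are assumptions modeled on the list above. Each matrix entry is expanded on its own into the cartesian product of its variables, and the results of all entries are concatenated.

```python
from itertools import product


def expand_matrix(job_name, matrix):
    """Expand each matrix entry into the cartesian product of its values.

    Hypothetical helper for illustration only; not GitLab's actual code.
    """
    jobs = []
    for entry in matrix:
        keys = list(entry)
        # Assumption: a scalar value behaves like a one-element list.
        value_lists = [v if isinstance(v, list) else [v] for v in entry.values()]
        for combo in product(*value_lists):
            label = "; ".join(f"{k}={v}" for k, v in zip(keys, combo))
            jobs.append(f"{job_name} ({label})")
    return jobs


# Python equivalent of the `matrix:` section in the YAML above.
matrix = [
    {"PROVIDER": "aws", "STACK": ["monitoring", "app1", "app2"]},
    {"PROVIDER": "ovh", "STACK": ["monitoring", "backup", "app"]},
    {"PROVIDER": "gcp", "STACK": ["data", "processing"]},
]

jobs = expand_matrix("deploystacks", matrix)
for job in jobs:
    print(job)
```

Because the entries are expanded separately, combinations such as PROVIDER=aws with STACK=backup are never generated.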

The benefits here are:

  • Keeping an explicit matrix keyword avoids the problem of being unable to extend the syntax in a reasonable way later: #15356 (comment 309409940). Yes, this is a little less concise than not having the keyword, but it makes the intention very clear.
  • More complex scenarios can still use child/parent pipelines, so we don't have to design this as an all-inclusive matrix build specification language.

A simpler matrix is also supported for the case where every entry of the first dimension is combined with every item of the second:

deploystacks:
  stage: deploy
  parallel:
    matrix:
      - PROVIDER: [aws, ovh, gcp]
        STACK: [monitoring, app1, app2]

This version would generate:

  1. deploystacks (PROVIDER=aws; STACK=monitoring)
  2. deploystacks (PROVIDER=aws; STACK=app1)
  3. deploystacks (PROVIDER=aws; STACK=app2)
  4. deploystacks (PROVIDER=ovh; STACK=monitoring)
  5. deploystacks (PROVIDER=ovh; STACK=app1)
  6. deploystacks (PROVIDER=ovh; STACK=app2)
  7. deploystacks (PROVIDER=gcp; STACK=monitoring)
  8. deploystacks (PROVIDER=gcp; STACK=app1)
  9. deploystacks (PROVIDER=gcp; STACK=app2)

Limitations

  • For the MVC, the total number of jobs will be limited to the same number as parallel (currently 50: https://docs.gitlab.com/ee/ci/yaml/#parallel)
  • There is no such thing as a variable containing an array in GitLab today, so setting up a matrix as follows is not possible:
  parallel:
    matrix:
      - PROVIDER: ${MY_PROVIDERS}
        STACK: ${MY_STACKS}

Instead, to avoid repetition for now, you should use extends:

.matrix:
  parallel:
    matrix:
      - PROVIDER: [aws, ovh, gcp]
        STACK: [monitoring, app1, app2]

build:
  extends: .matrix

test:
  extends: .matrix

deploy:
  extends: .matrix

External examples

Example of .travis.yml

rvm:
  - 1.9.3
  - 2.0.0
  - 2.2
  - ruby-head
  - jruby
  - rbx-2
  - ree
gemfile:
  - gemfiles/Gemfile.rails-2.3.x
  - gemfiles/Gemfile.rails-3.0.x
  - gemfiles/Gemfile.rails-3.1.x
  - gemfiles/Gemfile.rails-edge
env:
  - ISOLATED=true
  - ISOLATED=false

This produces 56 individual jobs.
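
The count of 56 follows from multiplying the sizes of the three dimensions (7 × 4 × 2). A quick Python check, using the values from the .travis.yml above (the variable names are only for illustration):

```python
from itertools import product

rvm = ["1.9.3", "2.0.0", "2.2", "ruby-head", "jruby", "rbx-2", "ree"]
gemfile = [
    "gemfiles/Gemfile.rails-2.3.x",
    "gemfiles/Gemfile.rails-3.0.x",
    "gemfiles/Gemfile.rails-3.1.x",
    "gemfiles/Gemfile.rails-edge",
]
env = ["ISOLATED=true", "ISOLATED=false"]

# Every (rvm, gemfile, env) combination becomes one job.
combinations = list(product(rvm, gemfile, env))
print(len(combinations))  # 7 * 4 * 2 = 56
```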

Real life example provided by @MrChrisW:

Travis docs: Build matrix

Also, GitHub CI/CD is planning the same feature: https://help.github.com/en/articles/workflow-syntax-for-github-actions#jobsjob_idstrategy

Related Links

Related issues: https://gitlab.com/gitlab-org/gitlab-ce/issues/13755, https://gitlab.com/gitlab-org/gitlab-ce/issues/19198

Availability & Testing

  • Unit test changes - Yes

    • Ensure the existing limitations of parallel are preserved: a minimum of 2 jobs and a maximum of 50 job instances are required, and empty arrays are not allowed.
    • Ensure the appropriate job instances are created as specified in the matrix.
    • Ensure multiple jobs can extend the same matrix and that the job instance count is correct for each.
  • Integration test changes - not required

  • End-to-end test change - not required
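
As a rough sketch of the checks listed under the unit tests, the Python below counts the job instances a matrix would create and rejects matrices that break the limits quoted in this issue. The names `count_matrix_jobs` and `validate_matrix` are hypothetical, and the real validation in GitLab may differ.

```python
from math import prod

# Limits quoted in this issue for `parallel`.
MIN_JOBS = 2
MAX_JOBS = 50


def count_matrix_jobs(matrix):
    """Return the number of job instances a matrix would create."""
    total = 0
    for entry in matrix:
        sizes = []
        for key, value in entry.items():
            # Assumption: a scalar value behaves like a one-element list.
            values = value if isinstance(value, list) else [value]
            if not values:
                raise ValueError(f"{key}: empty arrays are not allowed")
            sizes.append(len(values))
        # Each entry contributes the product of its dimension sizes.
        total += prod(sizes)
    return total


def validate_matrix(matrix):
    """Raise ValueError if the matrix violates the limits, else return the count."""
    total = count_matrix_jobs(matrix)
    if not MIN_JOBS <= total <= MAX_JOBS:
        raise ValueError(
            f"matrix creates {total} jobs; expected {MIN_JOBS} to {MAX_JOBS}"
        )
    return total


shared = [{"PROVIDER": ["aws", "ovh", "gcp"], "STACK": ["monitoring", "app1", "app2"]}]
print(validate_matrix(shared))
```

Under these assumptions, each job that extends the same matrix (build, test, deploy) would be validated separately and would get its own set of instances.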

Please see the test engineering planning process and reach out to your counterpart Software Engineer in Test for assistance: https://about.gitlab.com/handbook/engineering/quality/test-engineering/#test-planning

/cc @MrChrisW @markpundsack @ayufan

Edited by Marius Bobin