  • #21480
Closed
Created Aug 26, 2016 by Mark Pundsack (@markpundsack, Contributor)

`parallel` job keyword to speed up pipelines

Problem to Solve

The speed of builds is an important factor for any team, and running tests tends to take a big chunk of the time for any build. Providing a framework to simply parallelize tests will allow these teams to accelerate their software delivery process.

Description

We have split gitlab-ce's tests into multiple parallel jobs running substantially the same scripts which differ only by a loop index. Let's formalize this approach and create a parallel keyword which takes a number, N, and duplicates a job N times while setting CI_NODE_INDEX and CI_NODE_TOTAL for each job.
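As a rough illustration of how a job might consume these two variables, here is a minimal sketch that picks one node's share of a test list. The round-robin split, the `select_shard` helper, and the spec file names are assumptions made for illustration; the actual gitlab-ce jobs delegate the split to knapsack, as in the proposal below. The index is treated as 1-based, matching the proposal.

```python
import os


def select_shard(test_files, node_index, node_total):
    """Return the files one node should run (hypothetical helper).

    node_index is 1-based: valid values are 1..node_total.
    """
    if not 1 <= node_index <= node_total:
        raise ValueError("node_index must be between 1 and node_total")
    # Sort first so every node sees the same ordering of files.
    ordered = sorted(test_files)
    # Round-robin: node k takes every node_total-th file, starting at position k-1.
    return ordered[node_index - 1::node_total]


if __name__ == "__main__":
    # Fall back to a single node when the variables are not set.
    index = int(os.environ.get("CI_NODE_INDEX", "1"))
    total = int(os.environ.get("CI_NODE_TOTAL", "1"))
    # Hypothetical file names, for illustration only.
    files = ["spec/a_spec.rb", "spec/b_spec.rb", "spec/c_spec.rb"]
    for path in select_shard(files, index, total):
        print(path)
```

Every file lands on exactly one node, so the union of all shards is the full test list. A timing-aware splitter such as knapsack would balance the shards better than this plain round-robin.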

Proposal

Given:

rspec:
  stage: test
  parallel: 20
  script:
    - export KNAPSACK_REPORT_PATH=knapsack/rspec_node_${CI_NODE_INDEX}_${CI_NODE_TOTAL}_report.json
    - cp knapsack/rspec_report.json ${KNAPSACK_REPORT_PATH}
    - knapsack rspec

Generate 20 jobs named rspec 1/20 through rspec 20/20. (I prefer indexing from 1 for human-named items.) Each job would have a unique CI_NODE_INDEX, and CI_NODE_TOTAL would be set to 20. This would be handled at the parser level, so GitLab Runner wouldn't require any changes.
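The parser-level expansion described above could look roughly like the following. This is only a sketch under stated assumptions: the job is modelled as a plain dictionary, `expand_parallel` is a hypothetical name, and injecting the values through a per-job `variables` mapping is an assumption rather than something the proposal specifies.

```python
import copy


def expand_parallel(name, job):
    """Expand one job definition into its parallel copies (hypothetical sketch).

    Returns a dict mapping generated job names to job definitions.
    """
    total = job.get("parallel")
    if total is None:
        # No parallel keyword: the job is passed through unchanged.
        return {name: job}
    if not isinstance(total, int) or total < 1:
        raise ValueError("parallel must be a positive integer")

    expanded = {}
    for index in range(1, total + 1):  # 1-based, as the proposal prefers
        clone = copy.deepcopy(job)
        del clone["parallel"]  # the keyword is consumed by the expansion
        # Assumption: the values are passed to the job as ordinary variables.
        variables = clone.setdefault("variables", {})
        variables["CI_NODE_INDEX"] = str(index)
        variables["CI_NODE_TOTAL"] = str(total)
        expanded[f"{name} {index}/{total}"] = clone
    return expanded


jobs = expand_parallel(
    "rspec",
    {"stage": "test", "parallel": 20, "script": ["knapsack rspec"]},
)
```

Because each generated job is an ordinary job definition with two extra variables, nothing downstream of the parser needs to know that `parallel` was ever present.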

Note that .gitlab-ci.yml would support multiple definitions for parallel jobs (e.g. rspec and spinach) in the same file, and the CI_NODE_INDEX values would only be unique within each definition. For example, there would be two jobs running with CI_NODE_INDEX=1.
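To make the scoping concrete, the sketch below enumerates the generated jobs for two parallel definitions and counts how often each index value occurs. The `node_assignments` helper is hypothetical, and the spinach count of 10 is an assumed value chosen only for illustration.

```python
from collections import Counter


def node_assignments(definitions):
    """Yield (job_name, node_index, node_total) for every generated job.

    definitions maps a job name to its parallel count (hypothetical input shape).
    """
    for name, total in definitions.items():
        for index in range(1, total + 1):
            yield (f"{name} {index}/{total}", index, total)


# Assumed counts: 20 matches the rspec example; 10 for spinach is made up.
assignments = list(node_assignments({"rspec": 20, "spinach": 10}))

# How many jobs share each CI_NODE_INDEX value across the whole pipeline.
index_counts = Counter(index for _, index, _ in assignments)

# Generated job names stay unique even though indices repeat.
job_names = {job_name for job_name, _, _ in assignments}
```

With these counts, index 1 is shared by rspec 1/20 and spinach 1/10, while index 11 only occurs in the rspec definition. A script that needs a pipeline-wide unique key would have to combine the job name with the index.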

Links

  • This is a specific proposal of the general, larger issue of automatic parallelization (#3819 (closed)).
  • Works well with #21286 (closed).

/cc @ayufan @grzesiek

Edited Oct 30, 2018 by Jason Yavorska