GitLab Community Edition

Opened May 31, 2018 by Mark Pundsack@markpundsack

Directed acyclic graphs (DAG) for pipelines MVC

Description

GitLab CI/CD pipelines are pretty powerful. Sequential stages and parallel jobs provide a lot of configurability to handle a wide variety of needs. But sometimes it's not enough. Or at least, sometimes it's not efficient enough.

For example, when a project generates both Android and iOS apps in a multi-stage pipeline, people want the iOS deployment to start as soon as all the iOS tests pass rather than waiting for all the Android tests to pass too. The total compute time might be the same, but the wall-clock time is different. In more complicated cases, it's possible to significantly reduce the overall wall-clock time of the pipeline by declaring exactly which jobs depend on which other jobs.

A DAG lets a pipeline be described in terms of job dependencies, so that cloud compute resources can be applied automatically to run each job as soon as its dependencies have succeeded. This is very powerful and removes much of the manual optimization that pipelines otherwise require.
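
The Android/iOS example can be made concrete with a small sketch. The job names, durations (in minutes), and the assumption of unlimited runners are all made up for illustration; nothing here reflects how GitLab actually schedules jobs. It computes when each job finishes under stage-based ordering and under explicit per-job dependencies.

```python
# Hypothetical job durations in minutes; invented for illustration.
DURATIONS = {
    "build_ios": 5, "build_android": 5,
    "test_ios": 10, "test_android": 30,
    "deploy_ios": 5, "deploy_android": 5,
}

# Stage semantics: every job waits for all jobs of the previous stage.
STAGED = {
    "build_ios": [], "build_android": [],
    "test_ios": ["build_ios", "build_android"],
    "test_android": ["build_ios", "build_android"],
    "deploy_ios": ["test_ios", "test_android"],
    "deploy_android": ["test_ios", "test_android"],
}

# DAG semantics: every job waits only for the jobs it actually depends on.
DAG = {
    "build_ios": [], "build_android": [],
    "test_ios": ["build_ios"], "test_android": ["build_android"],
    "deploy_ios": ["test_ios"], "deploy_android": ["test_android"],
}


def finish_times(deps, durations):
    """Return the finish time of each job, assuming unlimited runners."""
    finished = {}

    def finish(job):
        if job not in finished:
            # A job starts once its slowest dependency has finished.
            start = max((finish(d) for d in deps[job]), default=0)
            finished[job] = start + durations[job]
        return finished[job]

    for job in deps:
        finish(job)
    return finished


staged = finish_times(STAGED, DURATIONS)
dag = finish_times(DAG, DURATIONS)
print(staged["deploy_ios"], dag["deploy_ios"])  # 40 20
print(sum(DURATIONS.values()))                  # 60 minutes of compute either way
```

With these numbers the total compute time is identical, but the iOS deployment finishes at minute 20 instead of minute 40 because it no longer waits for the slower Android tests.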

Proposal

  1. Add a needs keyword to .gitlab-ci.yml.
  2. needs defaults to all jobs in the previous non-empty stage, so today's behavior is maintained by default: a job waits for the entire previous stage to succeed.
  3. The needs value can be any job name in the same or previous stages.
    1. Declaring a job in a previous stage means this job can start executing earlier than normal, before the entire previous stage finishes.
    2. Declaring a job in the same stage means this job starts executing later than normal: it must wait not only for the stage to start, but also for the specific job in that stage to succeed.
  4. You can’t declare a job in a subsequent stage; that should be a parse-time error caught by the linter.
  5. Visually, we still render a pipeline with stages, including in the mini-graph version.
    1. In the full pipeline view, show proper dependency links, as simply as possible. It would be nice to have a pretty DAG viewer, but collapsing onto current representations should be sufficient (to start).
    2. Dependencies within a stage are interesting as you likely want to visualize them horizontally in sequence. Maybe that’s still feasible within a stage, and just have them collapsed in mini-graph view.
  6. Declaring a blank needs is valid and means the job has no dependencies and thus can, and should, start running as soon as possible.
    1. This lets you have multiple, parallel pipelines that are independent in one .gitlab-ci.yml, although with Starter, you can include separate declarations if that helps.
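
The rules in the proposal can be sketched as a small resolver. This is only an illustration under stated assumptions: the dictionaries stand in for a parsed .gitlab-ci.yml, the job names and the effective_needs helper are hypothetical, needs holds a single job name as in the proposal, and a blank needs is represented as None (which is what an empty YAML value parses to). It is not GitLab's implementation.

```python
# Hypothetical in-memory form of a parsed .gitlab-ci.yml.
STAGES = ["build", "test", "deploy"]

JOBS = {
    "build_android": {"stage": "build"},
    "build_ios": {"stage": "build"},
    "test_android": {"stage": "test", "needs": "build_android"},
    "test_ios": {"stage": "test", "needs": "build_ios"},
    "lint": {"stage": "test", "needs": None},  # blank needs: no dependencies
    "deploy_ios": {"stage": "deploy", "needs": "test_ios"},
    "deploy_android": {"stage": "deploy"},     # no needs: stage default
}


def effective_needs(name, jobs, stages):
    """Return the set of job names that must succeed before `name` starts."""
    job = jobs[name]
    stage_index = stages.index(job["stage"])

    if "needs" in job:
        needed = job["needs"]
        if needed is None:
            # Rule 6: blank needs means the job can start immediately.
            return set()
        if needed not in jobs:
            raise ValueError(f"{name}: unknown job {needed!r} in needs")
        if stages.index(jobs[needed]["stage"]) > stage_index:
            # Rule 4: needing a job in a subsequent stage is an error.
            raise ValueError(f"{name}: {needed!r} is in a subsequent stage")
        # Rule 3: a job in the same or a previous stage.
        return {needed}

    # Rule 2: default to every job in the nearest previous non-empty stage.
    for earlier in reversed(stages[:stage_index]):
        members = {n for n, j in jobs.items() if j["stage"] == earlier}
        if members:
            return members
    return set()


print(effective_needs("deploy_ios", JOBS, STAGES))              # {'test_ios'}
print(sorted(effective_needs("deploy_android", JOBS, STAGES)))  # ['lint', 'test_android', 'test_ios']
print(effective_needs("lint", JOBS, STAGES))                    # set()
```

The sketch checks each job in isolation; it does not detect cycles between jobs in the same stage, which a real linter would also need to reject.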

Notes / questions

  1. Do we need to support a list of jobs rather than a single job name? Probably eventually.
  2. If a bunch of jobs depend on another job, it slightly sucks to have to declare the needs for many jobs, but that’s part of why it’s not the default. Stages are easier to understand and use.
  3. Perhaps a way to group jobs under something less than a stage would be helpful, so you can say you depend on a bunch of jobs succinctly.
  4. Conversely, perhaps a wildcard pattern for job names would help, so you can quickly depend on similarly named jobs (e.g. rspec 1 20, rspec 2 20, etc.).
  5. There's a subtlety here that in order to support multiple independent pipelines, you don't want to wait for all prior stages to succeed, just for the most immediately prior stage to succeed. This way, if a pipeline starts running in the middle, so to speak, subsequent stages will trigger "correctly".
  6. If we let you declare a needs in another project's pipeline, we could solve multi-project dependencies (gitlab-ee#1681 (closed)).
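
As a rough illustration of the wildcard idea in note 4, shell-style patterns could be expanded against the known job names. The expand_needs helper, the pattern syntax, and the job list are assumptions made for this sketch; the issue does not specify any of them.

```python
from fnmatch import fnmatchcase


def expand_needs(pattern, job_names):
    """Return the job names matching a shell-style wildcard pattern."""
    return sorted(name for name in job_names if fnmatchcase(name, pattern))


# Hypothetical job names, following the "rspec 1 20" example above.
job_names = ["rspec 1 20", "rspec 2 20", "rubocop", "deploy"]
print(expand_needs("rspec *", job_names))  # ['rspec 1 20', 'rspec 2 20']
```

Expanding patterns at parse time would keep the rest of the pipeline logic working with plain job names, though it also implies needs would have to accept more than one job (see note 1).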

Links / references

  • Prior issue: #41947 (closed)
  • Delayed job issue which would benefit from needs behavior: #51352 (closed)
Edited Jan 03, 2019 by Jason Lenny

Milestone: 11.11
Labels: Product Vision 2019, UX, Verify, continuous integration, customer, devops:verify, direction, feature
Reference: gitlab-org/gitlab-ce#47063