Persistent, shared work areas between pipeline jobs
<!-- triage-serverless v3 PLEASE DO NOT REMOVE THIS SECTION -->
*This page may contain information related to upcoming products, features and functionality.
It is important to note that the information presented is for informational purposes only, so please do not rely on the information for purchasing or planning purposes.
Just like with all projects, the items mentioned on the page are subject to change or delay, and the development, release, and timing of any products, features, or functionality remain at the sole discretion of GitLab Inc.*
<!-- triage-serverless v3 PLEASE DO NOT REMOVE THIS SECTION -->
### Problem to solve
GitLab CI/CD was built around a few fundamental principles: Docker first, ephemeral build environments, and repeatable, verifiable build steps. These have guided many of its architecture decisions and are directly in line with our north star of [speed and scalability](https://about.gitlab.com/direction/ops/#speedy-reliable-pipelines). However, this focus on ephemeral environments has some drawbacks. Today, GitLab CI/CD provides a few methods to pass data and information between jobs and stages. [Cache](https://docs.gitlab.com/ee/ci/caching/) and [artifacts](https://docs.gitlab.com/ee/ci/yaml/#artifacts) both have use cases for sending files, data, and information to other jobs. However, caching is a "best effort" layer, and artifacts are heavier than many use cases need: they are persisted until they expire, which can be a long time.
For advanced users of GitLab CI/CD, this leaves a missing middle ground for sharing build environment data, variables, and intermediate build artifacts between jobs and stages. Even for beginners, though, our ephemeral environments can be confusing when getting started with CI/CD. First comes the surprise that the working directory disappears between jobs. Having discovered this, users naturally reach for caching, but the cache is not guaranteed to be available, and when it is missing their pipelines fail. Artifacts are durable and suitable for passing information between sequential jobs, but they have unintended side effects, such as persisting well beyond the pipeline run and adding the overhead of publishing and downloading artifacts between jobs.
### Use cases
Providing a solution here could potentially help with a number of use cases, depending on the solution:
- A repo that generates so many artifacts (GBs) that passing them as artifacts is too slow
- A repo that generates large artifacts (100 MB to 1 GB), for which using artifacts is too slow
- A repo that generates files with embedded secrets (such as Terraform files), which prevents uploading them as artifacts
- Internal implementations of features like SAST/Auto DevOps: https://gitlab.com/gitlab-org/gitlab-ee/issues/10479
- Sharing environments between jobs generally, but potentially also within the same stage
- An alternative to containers for "set up your build environment once and reuse it" (depending on your pipeline strategy)
- Monorepos with high levels of dependency sharing in a single project
- Sharing multiple sources to a single upstream (as in a DAG), e.g. https://gitlab.com/gitlab-org/gitlab/-/issues/32814 and https://gitlab.com/gitlab-org/gitlab/-/issues/20686
### Proposal
There are two primary proposals currently in flight, and ideally we would merge them into one that takes the best of both. The two are:
1. Shared workspaces that are passed from job to job but scoped to the lifetime of a pipeline based on the implicit (or explicit, in the case of DAG) dependency tree: https://gitlab.com/gitlab-org/gitlab/-/issues/29265
1. Sticky runners (sticky in the sense of something analogous to a sticky load balancer) that will allow a single runner to manage a series of jobs without recreating the workspaces: https://gitlab.com/gitlab-org/gitlab/-/issues/17497
An example of combining these could be something like:
1. Implement shared workspaces to allow for creating and archiving a pipeline lifetime-scoped artifact, that is automatically inherited based on `needs` or the normal rules for `artifacts` (i.e., later jobs get it for "free").
2. Add an optimization rule: if the next job is known to run on the same runner host, the upload is still done (in case some other job needs it later), but the workspace is left as-is. This would speed things up a bit by eliminating the need to recreate the work area.
3. Work with the team working on the service daemon (https://gitlab.com/gitlab-org/gitlab/-/issues/19435) to optimize the runner job allocations to maximize the instances where item 2 above happens.
The advantage of an approach like this is that you could keep the nice workflow syntax and paradigm, while a project with large intermediate assets (game development, for example) could avoid large uploads and downloads between every step by asking GitLab to persist a single execution environment.
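
As a rough illustration of the combined steps above, the following Python sketch models the decision a runner might make before and after each job. It is not GitLab code: the `Job` class, its `runner_host` field, the `plan_workspace_actions` function, and the action names `create`, `reuse`, `download`, and `upload` are all hypothetical, and the pipeline is simplified to a linear sequence of jobs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    runner_host: str  # hypothetical: the host the scheduler picked for this job


def plan_workspace_actions(jobs):
    """Return (job name, action before the job, action after the job) per job."""
    plan = []
    previous_host = None
    for job in jobs:
        if previous_host is None:
            before = "create"    # first job starts with a fresh workspace
        elif job.runner_host == previous_host:
            before = "reuse"     # same host: the workspace is left as-is
        else:
            before = "download"  # different host: inherit the archived workspace
        # The upload always happens, in case a later job elsewhere needs it.
        plan.append((job.name, before, "upload"))
        previous_host = job.runner_host
    return plan


pipeline = [
    Job("build", "runner-a"),
    Job("test", "runner-a"),
    Job("deploy", "runner-b"),
]
for name, before, after in plan_workspace_actions(pipeline):
    print(f"{name}: {before} -> run -> {after}")
```

In this sketch, `test` reuses the workspace because it lands on the same host as `build`, while `deploy` has to download it. The more often the scheduler can place consecutive jobs on the same host, the more downloads are skipped.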