
Incubator template for YAML includables that are not yet ready to become templates in their own right


Release notes

Add template for includable YAML that is tied to an enumerated list of downstream projects that include it.

Problem to solve

I submit that there are four broad stages for includable YAML in GitLab CI/CD.

  1. No include statements at all. A job may be extended several times within one .gitlab-ci.yml, but YAML is "shared" only by ad-hoc copy-pasting across repositories.
  2. ??? (non-obvious stage that I claim exists, or rather, should exist)
  3. .job-to-extend is defined in its own repository with its own tests, the test cases being jobs in its own .gitlab-ci.yml that extend .job-to-extend. The job makes certain promises that are verified by its own tests. (The job may also have additional, undocumented functionality that might vanish at any moment.) However, unlike true templates, these repositories may be unstable, and the rigor of semantic versioning may vary widely, though users can always pin to specific commit hashes if desired.
  4. GitLab CI/CD Templates. You all know these.

It sounds like the process for contributing to the official GitLab Templates is about to change, and I can't quite parse how. But this isn't about proper GitLab Templates, so I don't think this steps on anything happening there, or vice versa.

Certain kinds of jobs can straightforwardly jump to the third stage, like the building of container images discussed in #353334. In certain specific cases like that, the general capability immediately suggests itself, and while customization is obviously required in every case, the customization is obviously and conveniently confined to the Dockerfile.

include:
- project: 'shell-bootstrap-scripts/shell-bootstrap-scripts'
  file: 'build_with_kaniko.yaml'

But most jobs are not like that. I submit that the gap between the first stage and the third stage is a yawning valley of death. I don't really understand the plan for &7462, but I think it's focused on narrowing the gap between the third stage and the fourth stage. Which will definitely be nice to have, but I think most of the "missing" jobs, the templates that are failing to exist, are dying before ever reaching the third stage. If jobs get pushed more smoothly from the third stage of development to the fourth stage, that'll be a nice win, but the vast majority of developers' efforts will still be bottled up in the first stage.

That's the first and most important case I need to make: that there is a gap to be filled here. That there is a missing second stage. We need some kind of bridge or stepping-stone between what I'm calling the first stage of development and what I'm calling the third stage.

If development stays in the first, "copy-paste and season to taste" stage, then it never goes anywhere. It may serve individual projects quite well, but it will never become a general capability.

Even if you agree that there's a wide gap between the first stage and the third, there are multiple possible answers for what should go in between. But I would say that this missing second stage is when shared, includable YAML has its own independent development cadence, with its own independent merge requests, but does not have its own set of abstract test cases that cover its features. Such test cases are uniquely difficult to come up with for CI/CD scripts, not just because good tools for comparing the output are not available, but because we don't really know what the output ought to be except on a case-by-case basis. The third stage is really a very mature stage of development, a place you can only easily reach for capabilities that are particularly easy to abstract, like building container images from Dockerfiles.

But if shared YAML in the second stage does not have its own independent test cases...what pipeline does it actually run on a merge request?

YAML in the second stage can have an enumerated list of downstream projects that include it, and trigger those downstream pipelines for its own pipeline. It may additionally have some of its own independent test cases (which will hopefully increase over time as it moves toward the third stage of development), but for the most part, to evaluate a branch of the shared-included YAML, you run the enumerated list of pipelines that include it.
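Concretely, the shared repository's own pipeline could contain one multi-project trigger job per enumerated downstream project, along these lines (a hedged sketch using standard GitLab CI trigger syntax; the project path and job name here are illustrative, not the actual implementation):

```yaml
# One trigger job per enumerated downstream project that includes the
# shared YAML. `strategy: depend` makes this pipeline wait on, and
# inherit the status of, the downstream pipeline.
test-including-project-2:
  trigger:
    project: shell-bootstrap-scripts/sample-shared-gitlab-include/including-project-2
    branch: main
    strategy: depend
```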

(If-then-false statements to ensure pipelines fail when they go sideways are, as ever, a good idea.)
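The "if-then-false" guard is just ordinary shell: check a condition and exit non-zero so the CI job, and the upstream pipeline watching it, fails loudly instead of silently passing. A minimal self-contained sketch, with an illustrative check:

```shell
# Guard pattern: force the job to fail when output drifts from what is
# expected. The checked strings here are illustrative stand-ins for real
# pipeline output.
expected="built image tagged v1"
actual="built image tagged v1"
if [ "$expected" != "$actual" ]; then
  echo "sanity check failed: output diverged"
  false   # non-zero exit status fails the CI job
fi
echo "sanity check passed"
```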

That suggests that shared YAML in the second stage is not truly ready for general use. And it's not! YAML in the second stage will not usually be of general interest. (By contrast, in the third stage, anyone can pick up and use a job that e.g. builds container images, as long as they're willing to use bleeding-edge code that's still a bit unstable and janky.)

The point of the second stage is not to put out an initial request-for-comment or any such thing: the point of the second stage is to be very, very easy to reach from the first stage, to the point of being almost automatic. If developers habitually create shared YAML repositories to hold extendable CI/CD jobs, at the push of a button, then they'll spend much less time fiddling with bespoke solutions on each individual repository, and we'll have a much larger pool of repositories that can eventually reach the third stage of development (and, later, the fourth).

Intended users

This is intended to be used very broadly. The point of making it as push-button as possible is precisely so that anyone building a half-dozen similar pipelines --- even if they're not very similar --- will reach for a shared YAML include as a first option.

User experience goal

Since the whole point is to be as push-button as possible, the goal is to reduce the user experience down to a single button to set up the shared includable YAML (though of course, development of the actual job script might go on for months or years). The current implementation has a couple of undesirable stumbling blocks, but we'll get to that.

The other point of user interaction is to edit the enumerated list of downstream pipelines that control the pipeline for the shared YAML. If we must have such an enumerated list --- and I think we must --- then at least editing it should be trivial to do. I think the simplest method is to just keep the list as a regular version-controlled file in the repository. An example is at https://gitlab.com/shell-bootstrap-scripts/sample-shared-gitlab-include/sample-shared-gitlab-include/-/blob/main/project_branches_to_trigger.yaml.

project_branches_to_trigger:
- project: shell-bootstrap-scripts/sample-shared-gitlab-include/including-project-2
  branch: main
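To make the mechanics concrete, here is a hedged Python sketch of the generation step: given the parsed contents of project_branches_to_trigger.yaml (a list of project/branch entries), it renders one trigger job per downstream project. The function and job-naming scheme are my own invention for illustration, not the actual implementation:

```python
# Hypothetical generator: turn the enumerated downstream list into
# GitLab CI trigger jobs, one per project/branch entry.

def render_trigger_jobs(entries):
    """Render one GitLab CI trigger job per enumerated downstream project."""
    jobs = []
    for entry in entries:
        # Derive an illustrative job name from the last path component.
        name = "trigger-" + entry["project"].rsplit("/", 1)[-1]
        jobs.append(
            f"{name}:\n"
            f"  trigger:\n"
            f"    project: {entry['project']}\n"
            f"    branch: {entry['branch']}\n"
            f"    strategy: depend\n"
        )
    return "\n".join(jobs)

# Parsed equivalent of the project_branches_to_trigger.yaml example above.
entries = [
    {
        "project": "shell-bootstrap-scripts/sample-shared-gitlab-include/including-project-2",
        "branch": "main",
    },
]
print(render_trigger_jobs(entries))
```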

Proposal

There is a current working proof of concept: https://gitlab.com/shell-bootstrap-scripts/gitlab-include-cookiecutter https://gitlab.com/shell-bootstrap-scripts/sample-shared-gitlab-include/sample-shared-gitlab-include

A new repository of shared includable YAML can be set up --- in a sense --- with the push of a button at https://gitlab.com/shell-bootstrap-scripts/gitlab-include-cookiecutter/-/pipelines. One example such repository is https://gitlab.com/shell-bootstrap-scripts/sample-shared-gitlab-include/sample-shared-gitlab-include. https://gitlab.com/shell-bootstrap-scripts/sample-shared-gitlab-include/sample-shared-gitlab-include/-/blob/main/project_branches_to_trigger.yaml owns the list of downstream pipelines; editing that file changes the list of downstream pipelines. Editing https://gitlab.com/shell-bootstrap-scripts/sample-shared-gitlab-include/sample-shared-gitlab-include/-/blob/main/includable.yaml itself --- in a separate development branch, preferably --- triggers all those downstream pipelines.

Although it is obviously possible to assemble a pipeline .gitlab-ci.yml on-the-fly as an artifact file without making any git commits, the current implementation does make git commits: it creates separate, unprotected branches in the downstream repositories and runs the downstream pipelines on those unprotected branches. If all goes well, those ephemeral branches are cleaned up at the end. If all does not go well, those branches stick around to be inspected (to be automatically cleaned up later when the pipelines finally pass). The design impetus here is that, if anything does go wrong, you can always inspect the actual .gitlab-ci.yml that ran the same way you would normally read it (just on a different branch). It's possible that, with sufficiently rigorous logging of the .gitlab-ci.yml being written, the actual git commits could be dispensed with. (But it would still be nice to have a URL where the YAML could be viewed with syntax highlighting, just as it currently can on a branch.)
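For reference, the commit-free alternative would look roughly like a standard GitLab CI dynamic child pipeline: one job writes the generated YAML as an artifact, and a trigger job runs it. This is a sketch of the general pattern only, not the current implementation; the job names and the render_downstream_jobs.sh script are illustrative:

```yaml
# Generate the pipeline YAML as an artifact, then run it as a child
# pipeline, with no git commits in any repository.
generate-pipeline:
  stage: build
  script:
    - ./render_downstream_jobs.sh > generated-pipeline.yml  # hypothetical script
  artifacts:
    paths:
      - generated-pipeline.yml

run-generated-pipeline:
  stage: test
  trigger:
    include:
      - artifact: generated-pipeline.yml
        job: generate-pipeline
    strategy: depend
```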

The current implementation is aimed at being as seamless as possible from the user's perspective (the user here being someone developing a shared job but not interested in the internals of how the YAML gets shared-and-tested), but the underlying code that makes it happen is a terrible snarl. In addition to the git commits in the downstream repositories, https://gitlab.com/shell-bootstrap-scripts/sample-shared-gitlab-include/sample-shared-gitlab-include/-/blob/main/project_branches_to_trigger.yaml does its job by kicking off a job that edits its own repository. This can be very nice to have if things go wrong, since the current YAML (with all trigger clauses) can be trivially inspected just by looking at it, without needing to chase down an artifact stashed away in the pipeline. But it means the pipelines have a weird "bounce": one pipeline makes a commit, which kicks off another pipeline, which kicks off another pipeline, before things settle down.

So, yeah, interested in better ways to do this.

Further details

Permissions and Security

Documentation

Availability & Testing

Available Tier

  • Free

Feature Usage Metrics

If implemented as a template, this can use the standard template usage tracking.

What does success look like, and how can we measure that?

What is the type of buyer?

Is this a cross-stage feature?

Links / references
