Database Persisted CI Templates and Dynamically Generated CI YAML

This is a proposal to change how we define and store CI templates.

Implementing this proposal would:

  • Simplify the process of searching for, including, and configuring CI templates
  • Allow organizations to create instance- and group-wide CI template policies
  • Promote a consumer-driven usage model of GitLab-hosted CI templates

Current Problems

1. Including and configuring templates is hard

In order for a user to include and configure our CI templates, the user must

  1. Be made aware of the template
  2. Read documentation on how to:
    • Include the template
    • Configure the template
  3. Modify their .gitlab-ci.yml to include the template
    • Continuing to use GitLab documentation as a reference

It would be much simpler if CI templates (built-in or published by users):

  • Were easily discoverable and searchable from within GitLab
  • Had dynamic UI configuration forms that had embedded documentation
  • Did not require code changes in order to be included into a project

We are directly experiencing this problem within the Secure sub-department. We are frequently finding ourselves implementing custom configuration pages for our built-in CI templates so that they can be easily included into projects and configured through the UI.

A generic approach to dynamically including and configuring CI templates could save a lot of time and effort.

2. Deploying template inclusion policies is hard, and enforcing them is impossible

It is hard to deploy a template inclusion policy in a large organization. Unless every project already includes a common CI configuration, each project must be individually updated to include it.

Even if all projects currently include a common CI configuration, it is impossible to enforce its inclusion. Projects can easily forget to include or intentionally not include the organization's CI template.

Additionally, awareness of such a policy currently depends on the organization clearly communicating it to its team members.

It would be great if CI configuration policies:

  • Could be defined and enforced at the instance and group levels
  • Were directly visible and accessible within GitLab
    • In other words, viewing the settings/details of a project also shows enforced CI policies

3. GitLab doesn't support consumer-driven usage models well

To me, GitLab has a successful focus on the user. GitLab focuses on the users who are in their groups and projects doing work, mostly in a self-sufficient manner. These users consume services and data either from inside their own organization, or from outside of GitLab: official package managers, discovering related projects, etc.

It is very hard as a user to discover and consume resources from external parties within GitLab. On the opposite side of consumption, it is equally hard as a user to publish and make resources visible to other external users.

In the context of this proposal, and with our CI/CD system being a main feature of GitLab, it would be great if users could easily:

  • Publish CI templates at the public, instance, or group levels
  • Discover published CI templates via searching/automatic recommendations
  • Include and configure published CI templates into a project

Proposed Implementation Details

Explicitly declared job configuration options (a typed settings field)

A settings field could be defined at the job or global level within a ci.yml file that provides the following for each configuration option available to the user:

  • type information (int, string, url, choice, etc)
  • a human-readable name or label
  • a description/documentation
  • additional metadata (optional, required, etc)
  • default value

The settings field would describe the public interface to the job or pipeline that the user should provide values for. The values provided by the user for each setting would become normal environment variables within the CI pipeline when each job is executed.

Below is an example:

job:
  settings:
    ENDPOINT:
      type: text
      name: "Endpoint"
      description: "The endpoint to scan"
      required: true # not possible to set a default value. It is *REQUIRED* that
                     # the user define this field
    SCAN_TYPE:
      type: choice
      name: "Scan Type"
      choices:
        - QUICK
        - DIRTY
        - INEFFICIENT
        - AMAZING
        - THE_BEST
      default: THE_BEST
  script:
    - echo "$ENDPOINT"  # user-provided text
    - echo "$SCAN_TYPE" # user-provided option from the defined choices

The difference between public and private details used by CI templates is important. Currently the user has no way of knowing (apart from documentation) which variables amongst all of the defined variables in an included CI template are intended to be configured. Having a structured way to explicitly declare and document the interface to the template clears up this confusion.
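
To make the idea concrete, here is a minimal sketch of how declared settings and user-provided values could be resolved into the environment variables a job receives. The `SETTINGS` dictionary mirrors the YAML example above; `resolve_settings` and its error messages are hypothetical names chosen for illustration, not existing GitLab code.

```python
from typing import Any

# Hypothetical in-memory form of the `settings` field from the example above.
SETTINGS = {
    "ENDPOINT": {"type": "text", "name": "Endpoint", "required": True},
    "SCAN_TYPE": {
        "type": "choice",
        "name": "Scan Type",
        "choices": ["QUICK", "DIRTY", "INEFFICIENT", "AMAZING", "THE_BEST"],
        "default": "THE_BEST",
    },
}


def resolve_settings(
    declared: dict[str, dict[str, Any]], provided: dict[str, Any]
) -> dict[str, str]:
    """Turn user-provided values into the variables a job would see."""
    unknown = set(provided) - set(declared)
    if unknown:
        # Only explicitly declared settings are part of the public interface.
        raise ValueError(f"undeclared settings: {sorted(unknown)}")

    env: dict[str, str] = {}
    for key, spec in declared.items():
        if key in provided:
            value = provided[key]
        elif spec.get("required"):
            raise ValueError(f"missing required setting: {key}")
        elif "default" in spec:
            value = spec["default"]
        else:
            continue  # optional setting without a default stays unset
        if spec["type"] == "choice" and value not in spec["choices"]:
            raise ValueError(f"{key} must be one of {spec['choices']}")
        env[key] = str(value)
    return env


env = resolve_settings(SETTINGS, {"ENDPOINT": "https://example.com"})
print(env)  # {'ENDPOINT': 'https://example.com', 'SCAN_TYPE': 'THE_BEST'}
```

Rejecting undeclared keys is one possible design choice here: it keeps the private variables of a template out of reach of the configuration form.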

Settings field types

Basic types that we might want to support:

  • text
  • int
  • float
  • bool
  • choice
  • URL
  • regex

After the basic implementation is complete, more complex, context-aware types could be added, such as:

  • git ref/branch
  • git branch
  • project
  • username

With these data types being explicitly defined, we would be able to define frontend components specific to these types that help the user enter correct values.
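
As a rough illustration of what type-specific validation could look like, the sketch below keeps one validator per basic type in a lookup table. The `VALIDATORS` registry and `is_valid` helper are hypothetical. `choice` is left out because it needs the declared choices as extra input, and the `url` check assumes only http(s) URLs are acceptable.

```python
import re
from urllib.parse import urlparse


def _parses_as(cast):
    """Build a validator that accepts any string `cast` can parse."""
    def check(raw: str) -> bool:
        try:
            cast(raw)
            return True
        except ValueError:
            return False
    return check


def _is_url(raw: str) -> bool:
    try:
        parts = urlparse(raw)
    except ValueError:
        return False
    # Assumption: only absolute http(s) URLs count as valid.
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def _is_regex(raw: str) -> bool:
    try:
        re.compile(raw)
        return True
    except re.error:
        return False


# Hypothetical registry: one validator per basic settings type.
VALIDATORS = {
    "text": lambda raw: True,
    "int": _parses_as(int),
    "float": _parses_as(float),
    "bool": lambda raw: raw.lower() in ("true", "false"),
    "url": _is_url,
    "regex": _is_regex,
}


def is_valid(type_name: str, raw: str) -> bool:
    return VALIDATORS[type_name](raw)


print(is_valid("url", "https://gitlab.com"))  # True
print(is_valid("int", "4.2"))                 # False
```

Context-aware types such as git refs or usernames would need lookups against project data, so they would not fit this purely syntactic table.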

Complex Data Types (additional notes)

I'm on the fence about this idea, since it breaks direct compatibility with normal variable usage. With variables being the primary way that CI jobs/pipelines have always accepted input, I see this method adding complexity and potential problems for GitLab and its users.

That being said, I feel it is worth it to describe this idea.

Complex data types could also be allowed in the settings fields. A complex data type would result in a variable whose value is JSON.

Types such as lists:

job:
  settings:
    URL_LIST:
      name: list
      type: url

# would produce a variable `'URL_LIST=["http://url1","http://url2"]'`

and nested objects could be allowed:

job:
  settings:
    CHAIR:
      name: chair
      description: A description of a chair
      type: object
      fields:
        num_legs:
          name: Number of legs
          type: int
          default: 4
        wood_type:
          name: Wood Type
          type: choice
          choices:
            - MAHOGANY
            - IRON_WOOD
            - OAK

# would produce a variable `'CHAIR={"num_legs": 4, "wood_type": "OAK"}'`
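
A small sketch of the serialization step, assuming list and object values are rendered as compact JSON while simple values stay plain strings. `to_variable` is a hypothetical helper, and the second half shows how a job script would have to decode the value again, which is the extra burden mentioned above.

```python
import json


def to_variable(name: str, value) -> str:
    """Render one setting as NAME=value; lists and objects become JSON."""
    if isinstance(value, (list, dict)):
        rendered = json.dumps(value, separators=(",", ":"))
    else:
        rendered = str(value)
    return f"{name}={rendered}"


url_list = to_variable("URL_LIST", ["http://url1", "http://url2"])
chair = to_variable("CHAIR", {"num_legs": 4, "wood_type": "OAK"})
print(url_list)  # URL_LIST=["http://url1","http://url2"]
print(chair)     # CHAIR={"num_legs":4,"wood_type":"OAK"}

# Inside a job, the consumer has to parse the JSON back out of the variable.
decoded = json.loads(chair.split("=", 1)[1])
print(decoded["num_legs"])  # 4
```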

If this were supported, I would also recommend providing the fully-resolved JSON to each job in a file and/or environment variable:

job:
  script:
    - cat $CI_SETTINGS_FILE # file containing the fully-resolved JSON settings
    - echo $CI_SETTINGS     # environment variable containing fully-resolved JSON settings

Dynamically created configuration forms from settings data

Configuration forms could be dynamically created from metadata contained within the settings field described in the previous section.

For example, the following YAML:

job:
  settings:
    ENDPOINT:
      type: text
      name: "Endpoint"
      description: "The endpoint to scan"
      required: true # not possible to set a default value. It is *REQUIRED* that
                     # the user define this field
    SCAN_TYPE:
      type: choice
      name: "Scan Type"
      choices:
        - QUICK
        - DIRTY
        - INEFFICIENT
        - AMAZING
        - THE_BEST
      default: THE_BEST

Could be rendered dynamically as:

[Image: the configuration form rendered from the settings above]

Typed Components

Having distinct components in the frontend to render specific types will allow us to assist the user in providing correct values.

We can enforce only valid URLs to be used, valid regular expressions, or more context-aware types such as issue links, git refs, or usernames. Each frontend component would be able to be separately improved upon.
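
One way to picture the link between settings metadata and typed components is a backend step that turns each declared setting into a form-field descriptor for the frontend. Everything in this sketch is an assumption made for illustration: the component names in `COMPONENT_FOR_TYPE`, the descriptor keys, and the `build_form` helper.

```python
# Hypothetical mapping from settings types to frontend component names.
COMPONENT_FOR_TYPE = {
    "text": "TextInput",
    "int": "NumberInput",
    "float": "NumberInput",
    "bool": "Checkbox",
    "choice": "Dropdown",
    "url": "UrlInput",
    "regex": "RegexInput",
}

SETTINGS = {
    "ENDPOINT": {
        "type": "text",
        "name": "Endpoint",
        "description": "The endpoint to scan",
        "required": True,
    },
    "SCAN_TYPE": {
        "type": "choice",
        "name": "Scan Type",
        "choices": ["QUICK", "DIRTY", "INEFFICIENT", "AMAZING", "THE_BEST"],
        "default": "THE_BEST",
    },
}


def build_form(settings: dict) -> list[dict]:
    """Describe one form field per declared setting, in declaration order."""
    fields = []
    for key, spec in settings.items():
        fields.append({
            "variable": key,
            # Unknown types fall back to a plain text input.
            "component": COMPONENT_FOR_TYPE.get(spec["type"], "TextInput"),
            "label": spec.get("name", key),
            "help_text": spec.get("description", ""),
            "required": bool(spec.get("required", False)),
            "options": spec.get("choices", []),
            "initial": spec.get("default"),
        })
    return fields


form = build_form(SETTINGS)
for form_field in form:
    print(form_field["label"], "->", form_field["component"])
```

Because each type maps to exactly one component, improving (say) the URL input would benefit every template that declares a `url` setting.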

Allow CI templates to be published

This topic deserves its own discussion issue. To frame where this sub-topic is going, let's talk about a minimal implementation and a feature-rich implementation.

Note that we are talking in this section specifically about publishing a CI template, not consuming published templates.

Minimal Implementation

At a minimum, "publishing" a CI template could be something that GitLab does with its own built-in CI templates. In this minimal case, "publishing" simply means that metadata about the includable CI template is persisted in the database. Users would not have the ability to publish CI templates.

Information about an includable, built-in CI template that is persisted in the database would need its own table (perhaps ci_published_templates) and would likely need to include:

| field | type | notes |
| --- | --- | --- |
| scoped_namespace_id | int | The scope (public/instance/group) that the template belongs to |
| project_id | int | Project that contains the CI yaml |
| ref | text | The ref of the project |
| file_path | text | The path within the project of the CI yaml to be included |
| publisher_id | int | The publisher of the CI template |
| timestamps | ... | Basic create/modify timestamps |
| settings | json | The cached settings data from the template's CI yaml |
| description | text | A description of the CI template |
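
To show how such a record might look in practice, here is a sketch of the table using Python's built-in `sqlite3` module. SQLite is used only to keep the example self-contained. The column constraints, the choice of `NULL` in `scoped_namespace_id` to mean "public", and the sample row are all assumptions, not a proposed migration.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE ci_published_templates (
        id                  INTEGER PRIMARY KEY,
        scoped_namespace_id INTEGER,          -- assumption: NULL means public
        project_id          INTEGER NOT NULL,
        ref                 TEXT NOT NULL,
        file_path           TEXT NOT NULL,
        publisher_id        INTEGER NOT NULL,
        created_at          TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        updated_at          TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        settings            TEXT NOT NULL,    -- cached settings as JSON text
        description         TEXT
    )
""")

# Illustrative values only; the ids and path are made up.
settings = {"ENDPOINT": {"type": "text", "name": "Endpoint", "required": True}}
conn.execute(
    "INSERT INTO ci_published_templates "
    "(scoped_namespace_id, project_id, ref, file_path, publisher_id, "
    " settings, description) VALUES (?, ?, ?, ?, ?, ?, ?)",
    (None, 100, "main", "templates/scan.yml", 1,
     json.dumps(settings), "Example scanner template"),
)

row = conn.execute(
    "SELECT file_path, settings, created_at FROM ci_published_templates"
).fetchone()
cached = json.loads(row[1])
print(row[0], cached["ENDPOINT"]["required"])  # templates/scan.yml True
```

Caching the settings alongside the template reference means the configuration form can be rendered without fetching and parsing the template's YAML on every page load.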

Feature-rich Implementation

In a feature-rich implementation of publishing CI templates, users would be able to manage their published CI templates through the UI:

  • View existing CI templates published by the user
  • Publish new CI templates to a namespace that the user is a member of
    • public, instance, or a specific group
  • Release a new version of a CI template
  • Delete published CI templates

This will require a large amount of frontend, backend, and API work. It would be a major effort.

Additional database fields would likely need to be added at this point, including:

  • homepage (CI template publishers may want to link to their own documentation)
  • metrics (how many times the template was included [on public projects only?])

Allow published CI templates to be consumed

Allowing published CI templates to be consumed requires a few steps:

  1. Searching for and including a published template
  2. Providing settings values for the template
  3. Applying all dynamically included CI templates at pipeline-creation-time
    • See also the later section on instance/group CI policies: "Allowing CI policies to be defined at the instance & group levels"

1. Searching for and including a published template

A minimal approach to this could be to only make the built-in GitLab templates discoverable by the user.

A feature-rich approach could include:

  • Robust searching for templates with advanced filters, sorting, etc
  • Automatic project type detection and display of relevant, published templates

Both the minimal and the feature-rich implementations would require frontend and backend work.

2. Providing settings values for the template

Once a to-be-included template has been selected for a project, the user must provide any required settings values before the template can be included in a pipeline. This would be done through a dynamically created form, as discussed in the earlier section "Dynamically created configuration forms from settings data".

The user-provided settings values would need to be persisted in the database. A new table (perhaps ci_included_templates) could be created with the following fields:

| field | type | notes |
| --- | --- | --- |
| project_id | int | Id of the project that is including the template |
| ci_published_template_id | int | Id of the included template |
| settings | json | JSON data of the settings provided by the user for the template |
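
The lookup needed at pipeline-creation time could then be a simple join between the two tables. The sketch below again uses in-memory SQLite for self-containment and trims `ci_published_templates` to the columns the join needs. The uniqueness constraint, the sample ids, and the `included_templates` helper are assumptions.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ci_published_templates (
        id         INTEGER PRIMARY KEY,
        project_id INTEGER NOT NULL,
        ref        TEXT NOT NULL,
        file_path  TEXT NOT NULL
    );
    CREATE TABLE ci_included_templates (
        id                       INTEGER PRIMARY KEY,
        project_id               INTEGER NOT NULL,
        ci_published_template_id INTEGER NOT NULL
            REFERENCES ci_published_templates (id),
        settings                 TEXT NOT NULL,  -- user-provided values as JSON
        -- assumption: a project includes a given template at most once
        UNIQUE (project_id, ci_published_template_id)
    );
""")
conn.execute(
    "INSERT INTO ci_published_templates VALUES (1, 100, 'main', 'templates/scan.yml')"
)
conn.execute(
    "INSERT INTO ci_included_templates "
    "(project_id, ci_published_template_id, settings) VALUES (?, ?, ?)",
    (42, 1, json.dumps({"ENDPOINT": "https://example.com"})),
)


def included_templates(project_id: int) -> list[dict]:
    """Return every template a project includes, with its stored settings."""
    rows = conn.execute(
        """
        SELECT t.project_id, t.ref, t.file_path, i.settings
        FROM ci_included_templates AS i
        JOIN ci_published_templates AS t ON t.id = i.ci_published_template_id
        WHERE i.project_id = ?
        ORDER BY i.id
        """,
        (project_id,),
    )
    return [
        {"project_id": p, "ref": r, "file": f, "settings": json.loads(s)}
        for p, r, f, s in rows
    ]


print(included_templates(42))
```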

3. Applying all dynamically included CI templates at pipeline-creation-time

Before a new pipeline runs, a CI YAML pre-processing stage occurs that recursively resolves the initial YAML into a fully-resolved version.

Applying all dynamically included CI templates that the user has configured through the UI would be another step in the pre-processing stage:

  • includes would be inserted for each dynamically included CI template
    • (using relevant entries in the ci_included_templates table)
  • User-provided settings would be inserted as values within a variables section
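
The two steps above can be sketched as a function over the already-parsed project configuration. The `include` entries use the existing `project`/`ref`/`file` form of the `include` keyword. The helper name, the shape of the `included` records, and the rule that variables already present in the project's YAML take precedence are assumptions for illustration.

```python
import copy


def apply_included_templates(config: dict, included: list[dict]) -> dict:
    """Insert includes and variables for UI-configured templates."""
    result = copy.deepcopy(config)  # never mutate the project's own config

    includes = result.get("include", [])
    if not isinstance(includes, list):
        includes = [includes]  # `include` may be written as a single entry
    variables = dict(result.get("variables", {}))

    for entry in included:
        includes.append(
            {"project": entry["project"], "ref": entry["ref"], "file": entry["file"]}
        )
        for key, value in entry["settings"].items():
            # Assumption: values already set in the project's YAML win.
            variables.setdefault(key, value)

    result["include"] = includes
    if variables:
        result["variables"] = variables
    return result


project_config = {"stages": ["test"], "variables": {"FOO": "bar"}}
included = [{
    "project": "security/templates",  # hypothetical project path
    "ref": "main",
    "file": "/scan.yml",
    "settings": {"ENDPOINT": "https://example.com"},
}]
resolved = apply_included_templates(project_config, included)
print(resolved["include"])
print(resolved["variables"])
```

After this step the normal recursive include resolution would run unchanged, since the result is just an ordinary configuration with extra `include` and `variables` entries.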

Allowing CI policies to be defined at the instance & group levels

Allowing CI policies to be defined at the instance and group levels would be an extension of the project-specific scenario.

A separate table could be made to track dynamically included CI templates that are defined at the instance or group levels:

| field | type | notes |
| --- | --- | --- |
| namespace_id | int | Id of the namespace that is including the template |
| ci_published_template_id | int | Id of the included template |
| settings | json | JSON data of the settings provided by the user for the template |
| overrideable | boolean | Whether the template can be overridden by child groups or projects |

Group and instance-specific configuration pages would need to be created.

When a pipeline runs within a project, the CI YAML pre-processing stage would load all included CI templates (and their defined settings values) from ancestor namespaces. Descendant projects/groups could be allowed to override settings set by ancestor namespaces.

Once all dynamically included CI templates from ancestor namespaces and the project are resolved, the CI YAML pre-processing stage continues as normal.
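
A possible shape for the ancestor resolution is sketched below. It walks the levels from the outermost namespace down to the project and lets a descendant change settings only while no ancestor has marked the template as non-overrideable. The `Inclusion` record, the `resolve_inclusions` helper, and the exact merge semantics are assumptions; the proposal leaves these details open.

```python
from dataclasses import dataclass


@dataclass
class Inclusion:
    template_id: int
    settings: dict
    overrideable: bool = True
    source: str = ""  # namespace or project that first caused the inclusion


def resolve_inclusions(
    levels: list[tuple[str, list[Inclusion]]]
) -> dict[int, Inclusion]:
    """`levels` is ordered from outermost (instance) to innermost (project)."""
    resolved: dict[int, Inclusion] = {}
    for source, inclusions in levels:
        for inc in inclusions:
            current = resolved.get(inc.template_id)
            if current is None:
                resolved[inc.template_id] = Inclusion(
                    inc.template_id, dict(inc.settings), inc.overrideable, source
                )
            elif current.overrideable:
                # Descendant values replace ancestor values key by key.
                merged = {**current.settings, **inc.settings}
                resolved[inc.template_id] = Inclusion(
                    inc.template_id, merged, inc.overrideable, current.source
                )
            # else: an ancestor locked this template, ignore the override
    return resolved


levels = [
    ("instance", [Inclusion(1, {"SCAN_TYPE": "QUICK"}, overrideable=False)]),
    ("group-a", [
        Inclusion(1, {"SCAN_TYPE": "THE_BEST"}),
        Inclusion(2, {"ENDPOINT": "https://a.example"}),
    ]),
    ("project", [Inclusion(2, {"ENDPOINT": "https://b.example"})]),
]
resolved = resolve_inclusions(levels)
for inc in resolved.values():
    print(inc.template_id, inc.settings, "from", inc.source)
```

Keeping the `source` of the first inclusion makes it possible to tell users which namespace caused a template to be applied.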

Viewing Included CI Templates

As a user, it would be very important to be able to view the full set of CI templates included in a project. This could be done via:

  • The UI
    • show a list of included CI templates and their settings, and which namespace/group caused the inclusion
  • Build artifact (ish?)
    • The fully-resolved YAML could be persisted and viewed
    • Some form of this would be important so that repeatable builds can occur when a job in a pipeline is retried. This should have the same policy that our current CI include code has.
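
As one hedged illustration of the persistence idea, the fully-resolved configuration could be stored together with a digest when the pipeline is created, so that a retried job reuses exactly the same configuration. The sketch serializes to JSON rather than YAML only to stay within the Python standard library; `snapshot` is a hypothetical helper.

```python
import hashlib
import json


def snapshot(resolved_config: dict) -> tuple[str, str]:
    """Return a canonical serialization of the config and its SHA-256 digest."""
    payload = json.dumps(resolved_config, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return payload, digest


config = {
    "variables": {"ENDPOINT": "https://example.com"},
    "include": [{"project": "security/templates", "ref": "main", "file": "/scan.yml"}],
}
payload, digest = snapshot(config)

# On retry, the stored payload is loaded instead of re-resolving templates,
# and the digest can confirm that nothing changed in between.
restored = json.loads(payload)
assert snapshot(restored)[1] == digest
print(digest[:12])
```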

Auto-Summary 🤖

Discoto Usage

Points

Discussion points are declared by headings, list items, and single lines that start with the text (case-insensitive) point:. For example, the following are all valid points:

  • #### POINT: This is a point
  • * point: This is a point
  • + Point: This is a point
  • - pOINT: This is a point
  • point: This is a **point**

Note that any markdown used in the point text will also be propagated into the topic summaries.

Topics

Topics can be stand-alone and contained within an issuable (epic, issue, MR), or can be inline.

Inline topics are defined by creating a new thread (discussion) where the first line of the first comment is a heading that starts with (case-insensitive) topic:. For example, the following are all valid topics:

  • # Topic: Inline discussion topic 1
  • ## TOPIC: **{+A Green, bolded topic+}**
  • ### tOpIc: Another topic

Quick Actions

| Action | Description |
| --- | --- |
| /discuss sub-topic TITLE | Create an issue for a sub-topic. Does not work in epics |
| /discuss link ISSUABLE-LINK | Link an issuable as a child of this discussion |

Last updated by this job

Discoto Settings
---
summary:
  max_items: -1
  sort_by: created
  sort_direction: ascending

See the settings schema for details.

Edited by Lucas Charles