
Discussion: CI inputs and interpolation strategy and future vision

Note on scope of this issue: The intention of this issue is to discuss how inputs are interpolated, and the impact of that implementation on our feature plans. It is not to discuss alternative ways of defining inputs (from a file, in the UI, etc.).

Expected outcome: Issues created for the chosen next steps, so that we can fix input types not being respected and implement array type inputs.

Objectives

  1. Objective 1: Have inputs be usable in all places that variables can be used, and be able to have any value variables can have
    1. Hash interpolation supports this now, except that number and boolean types are not being respected
  2. 💪 Objective 2: Make inputs more powerful than variables
    1. Add array type: gitlab-org/gitlab#407176 (closed)
      1. Question: Do we want to require users to declare the type of the items in an array input? Will we allow multi-type arrays?
    2. Add hash type:
      1. Question: We do want these, right? Will we enforce typing of subkeys in the hash? Will the input definition specify the entire hash structure?
      2. If we want this, let's make an issue for it
    3. Support inputs with !reference: gitlab-org/gitlab#424481
      1. Text interpolation supports this. Hash interpolation does not
      2. Note: This may not be necessary: gitlab-org/gitlab#424481 (comment 1751780425)
    4. Have a way to use inputs to selectively include or not include CI keywords: gitlab-org/gitlab#438771 (closed)
  3. 🛠 Objective 3: Ensure that inputs can be used in an isolated and consistent manner
    1. Expand variables before using them as input values: gitlab-org/gitlab#438723
    2. Ensure that input types are respected: gitlab-org/gitlab#434826 (closed)
    3. Maybe: Provide a function that makes it easier to inject inputs into scripts without worrying about quoting issues: gitlab-org/gitlab#407556 (comment 1636396937)
  4. 🔒 Objective 4: Ensure the implementation does not unnecessarily open us or our users up to attack
    1. Question: In text interpolation, it's possible to use string inputs to manipulate the content of the YAML file. Are we okay with that? gitlab-org/gitlab#439272 (comment 1751952462)
    2. Question: Are we planning to try to use inputs with compliance pipelines in a way that does not allow users to change the content of the pipeline?

Options

Hash interpolation

This is the first iteration of CI interpolation. It's currently being used in the CI Catalog and CI inputs Betas.

How it works

  1. An included YAML file is parsed into a Ruby hash
  2. Each node of the hash is scanned for interpolation blocks.
    1. If one is found, it's replaced with the corresponding input
    2. This operation uses gsub
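
As a rough sketch of the steps above (not the actual GitLab implementation): the block pattern, the `interpolate_node` helper and the sample job are all made up for illustration. It walks the parsed hash and runs gsub on each string node, which also shows why number and boolean inputs currently come out as strings.

```ruby
require 'yaml'

# Hypothetical pattern for an interpolation block such as $[[ inputs.name ]].
NODE_BLOCK = /\$\[\[\s*inputs\.(\w+)\s*\]\]/

# Recursively walk the parsed YAML and substitute inputs in every string node.
def interpolate_node(node, inputs)
  case node
  when Hash
    node.to_h { |key, value| [interpolate_node(key, inputs), interpolate_node(value, inputs)] }
  when Array
    node.map { |item| interpolate_node(item, inputs) }
  when String
    # gsub always produces a String, so the input's original type is lost here.
    node.gsub(NODE_BLOCK) { inputs.fetch(Regexp.last_match(1)).to_s }
  else
    node
  end
end

# Step 1: the included file is parsed into a Ruby hash before any substitution.
template = YAML.safe_load(<<~CI_YAML)
  test:
    script: echo "$[[ inputs.message ]]"
    parallel: $[[ inputs.parallel ]]
    allow_failure: $[[ inputs.allow_failure ]]
CI_YAML

# Step 2: each node is scanned and the blocks are replaced.
result = interpolate_node(template, { 'message' => 'hello', 'parallel' => 3, 'allow_failure' => true })
```

In this sketch `result['test']['parallel']` ends up as the string `"3"` rather than the integer `3`, which is the "types not respected" problem described under Challenges.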

Challenges

  1. Non-string types are not being respected
  2. We don't yet know how, or whether, to implement structured types (array and hash)
  3. We don't yet know how, or whether, to make !reference work (if we decide that is a necessity)

Possible next steps

  1. Fix non-string input types: gitlab-org/gitlab#439272 (comment 1751757103) contains a possible fix
  2. Spike on including array and hash type inputs
  3. Spike on using !reference
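
One way a fix for non-string types could look, as a sketch only (this is not necessarily what the linked comment proposes): when a node consists of nothing but a single interpolation block, return the input value itself instead of going through gsub. The pattern names and the `interpolate_typed` helper are hypothetical.

```ruby
# Hypothetical block patterns; the names are illustrative only.
ANY_BLOCK = /\$\[\[\s*inputs\.(\w+)\s*\]\]/
ONLY_BLOCK = /\A\$\[\[\s*inputs\.(\w+)\s*\]\]\z/

# Interpolate a single string node while preserving the input's type
# when the node is exactly one interpolation block.
def interpolate_typed(node, inputs)
  whole = ONLY_BLOCK.match(node)
  return inputs.fetch(whole[1]) if whole

  # Mixed content can only ever be a string, so fall back to gsub.
  node.gsub(ANY_BLOCK) { inputs.fetch(Regexp.last_match(1)).to_s }
end

typed_inputs = { 'parallel' => 3, 'allow_failure' => true, 'tags' => %w[docker linux] }

as_number = interpolate_typed('$[[ inputs.parallel ]]', typed_inputs)          # stays an Integer
as_text   = interpolate_typed('run $[[ inputs.parallel ]] jobs', typed_inputs) # still a String
as_array  = interpolate_typed('$[[ inputs.tags ]]', typed_inputs)              # stays an Array
```

The same whole-node rule would also give array inputs a natural place to land, which may be relevant to the array spike, though that is an assumption rather than a settled design.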

Text interpolation

This is a proposed second iteration of CI interpolation. It's currently implemented behind the (disabled) ci_text_interpolation feature flag.

How it works

  1. An included YAML file is scanned in its entirety for interpolation blocks (gsub)
  2. Input values are cast to JSON (except string types, which are more complicated)
  3. The blocks are replaced by the input values (gsub)
  4. The YAML is parsed into Ruby
  4. The YAML is parsed into Ruby
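
A minimal sketch of this flow, assuming a hypothetical block pattern and an `interpolate_text` helper that are not taken from the real code behind the feature flag. It folds the scan and replace steps into one gsub call and inserts string inputs as-is, which glosses over the string special case.

```ruby
require 'json'
require 'yaml'

# Hypothetical pattern for an interpolation block such as $[[ inputs.name ]].
TEXT_BLOCK = /\$\[\[\s*inputs\.(\w+)\s*\]\]/

def interpolate_text(yaml_text, inputs)
  interpolated = yaml_text.gsub(TEXT_BLOCK) do
    value = inputs.fetch(Regexp.last_match(1))
    # Non-string values are cast to JSON; strings are inserted verbatim here,
    # which is a simplification of the "more complicated" string handling.
    value.is_a?(String) ? value : value.to_json
  end

  # Parsing happens only after every block has been replaced.
  YAML.safe_load(interpolated)
end

text = <<~CI_YAML
  test:
    script: echo "$[[ inputs.message ]]"
    parallel: $[[ inputs.parallel ]]
    tags: $[[ inputs.tags ]]
CI_YAML

config = interpolate_text(text, { 'message' => 'hello', 'parallel' => 3, 'tags' => %w[docker linux] })
```

Because JSON arrays and numbers are also valid YAML, `config['test']['parallel']` is an integer and `config['test']['tags']` is an array after parsing, with no type-specific code.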

Challenges

  1. Because the YAML is parsed after interpolation, string inputs can be used to manipulate the included YAML in possibly unexpected ways
    1. This could prevent us from using inputs to strengthen the immutability of compliance pipelines
  2. String inputs need to be treated as a special case because casting them to JSON sometimes adds unwanted quotes
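
Both challenges can be reproduced in a few lines. This is an illustrative sketch with a made-up template and a hypothetical block pattern, not a demonstration against the real implementation.

```ruby
require 'json'
require 'yaml'

# Hypothetical pattern for an interpolation block such as $[[ inputs.name ]].
INJECT_BLOCK = /\$\[\[\s*inputs\.(\w+)\s*\]\]/

deploy_template = <<~CI_YAML
  deploy:
    script: $[[ inputs.command ]]
CI_YAML

# Challenge 1: a string input containing a newline and matching indentation
# adds a sibling key to the job once the text is parsed.
hostile = "echo safe\n  allow_failure: true"
injected = YAML.safe_load(deploy_template.gsub(INJECT_BLOCK) { hostile })
# injected['deploy'] now contains an allow_failure key the template never declared.

# Challenge 2: casting a string to JSON adds quotes that are unwanted
# when the block sits in the middle of a command.
line = 'script: echo $[[ inputs.message ]] done'
quoted = line.gsub(INJECT_BLOCK) { 'hello'.to_json }
# quoted is: script: echo "hello" done
```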

Possible next steps

  1. Implement a "best attempt" fix for string type interpolation: gitlab-org/gitlab!143078 (closed) (with some updates)
    1. When a string input meets the following conditions:
      1. It might be parsed as a non-string
      2. It is surrounded by whitespace
    2. Then surround it with quotes. Otherwise leave it unquoted
    3. Update the docs to provide examples of workarounds for this problem
  2. Add functions that can be used to control the quoting of string inputs. One option might be to have a shellescape function that makes it easier to insert strings into shell commands. We could also have quote and unquote functions
  3. Determine whether YAML manipulation is a security concern
    1. Particularly in the case of compliance pipelines
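
To make the first two next steps more concrete, here is a hedged sketch. The `best_attempt_quote` and `shell_safe` helpers are hypothetical names, treating the start or end of the text as whitespace is an assumption, and the shell helper simply delegates to Ruby's standard `Shellwords.escape`.

```ruby
require 'json'
require 'shellwords'
require 'yaml'

# True when YAML would read the bare text as something other than a string
# (for example true, 3 or null), or cannot parse it at all.
def parsed_as_non_string?(value)
  !YAML.safe_load(value).is_a?(String)
rescue StandardError
  true
end

# "Best attempt" rule: quote the string only if it might be parsed as a
# non-string AND the block is surrounded by whitespace.
# before/after are the characters next to the block (nil at the text boundary).
def best_attempt_quote(value, before:, after:)
  surrounded = [before, after].all? { |char| char.nil? || char.match?(/\s/) }
  (parsed_as_non_string?(value) && surrounded) ? value.to_json : value
end

# Possible shellescape-style function for inserting strings into shell commands.
def shell_safe(value)
  Shellwords.escape(value)
end

standalone = best_attempt_quote('true', before: ' ', after: "\n") # quoted
embedded   = best_attempt_quote('true', before: '"', after: '"')  # left as-is
plain      = best_attempt_quote('hello', before: ' ', after: "\n") # left as-is
escaped    = shell_safe("it's a test; ls")
```

The heuristic is only a best attempt: it cannot know whether the author wanted the boolean `true` or the string `"true"`, which is why the docs would still need workaround examples.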

???

Is there a better way to go about this? Is there a point at which we'd like to explore replacing YAML with something easier to interpolate/manipulate? 😱 Not sure it's the right time to jump into this hairy discussion, but felt like I should mention it here

Decision time!

We'll move ahead with the plan discussed in #127 (comment 1755423243)

Implementation Table

Issue | Blocked?
--- | ---
Backend: Pipeline boolean and number include in... (gitlab-org/gitlab#434826 - closed) | No
Backend: CI interpolation with arrays (gitlab-org/gitlab#407176 - closed) | By Backend: Pipeline boolean and number include in... (gitlab-org/gitlab#434826 - closed)
Backend: Implement support for "null" value to ... (gitlab-org/gitlab#440468) | By Backend: Pipeline boolean and number include in... (gitlab-org/gitlab#434826 - closed)
Remove CI text interpolation code (gitlab-org/gitlab#440667 - closed) | By Backend: CI interpolation with arrays (gitlab-org/gitlab#407176 - closed)
Edited by Avielle Wolfe