Backend: Expand CI variables lazily and selectively

Problem

In GitLab, the only time we need to build the full set of CI variables is when we send them to the runner while assigning a job to execute. In all other scenarios, we use variables to expand strings that may contain other variables.

Yet to expand even a single variable in a string, we build a large set of variables, most of which the string never references.

job:
  script: echo
  rules:
    - if: $CI_COMMIT_BRANCH 

To expand $CI_COMMIT_BRANCH we first build all variables that can potentially be used in that given context.

Can we lazily build only CI_COMMIT_BRANCH and memoize its value?

Idea

A general idea would be to have:

  • a hash of GitLab-defined variables (mostly our predefined variables) for fast access.
  • a lazy evaluation of the variables persisted as AR models (e.g. project variables, group variables, etc.). For example: a lookup by key.
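The lookup by key mentioned in the second bullet could be sketched roughly as follows. This is a minimal illustration under assumptions, not GitLab code: `LazyPersistedVariables` and its `finder` callable are hypothetical names, and a plain lambda over a Hash stands in for the database query that would load a project or group variable.

```ruby
# Hypothetical stand-in for a lazily evaluated source of persisted variables.
# `finder` is any callable that loads a single value by key. In GitLab this
# would be a database query; here it is a lambda so the sketch runs on its own.
class LazyPersistedVariables
  def initialize(finder)
    @finder = finder
    @cache = {} # key => value; nil results are cached too, so misses are not re-queried
  end

  def [](key)
    return @cache[key] if @cache.key?(key)

    @cache[key] = @finder.call(key)
  end
end

# Usage: count how many times the backing store is hit.
store = { 'DEPLOY_TOKEN' => 'secret', 'REGION' => 'eu' }
queries = 0
variables = LazyPersistedVariables.new(->(key) { queries += 1; store[key] })

variables['REGION']  # first access queries the store
variables['REGION']  # second access is served from the cache
variables['MISSING'] # a miss is looked up once and cached as nil
variables['MISSING']

puts queries
```

Checking `@cache.key?(key)` rather than the cached value itself is what lets a missing variable be remembered, so repeated lookups of an undefined key do not hit the store again.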

The hash for GitLab-defined variables could be something like:

class Ci::Variables::Builder::Static # variables defined statically by GitLab
  VARS = {
    'CI_PROJECT_PATH'        => ->(context) { context.project.full_path },
    'CI_PROJECT_DESCRIPTION' => ->(context) { context.project.description },
    'CI_PIPELINE_SOURCE'     => ->(context) { context.pipeline.source.to_s },
    # ...
  }

  def initialize(context)
    @context = context
    @cache = {}            # cache already computed values.
    @all_evaluated = false # whether all values have been evaluated and cached.
  end

  def [](key)
    return @cache[key] if @cache.key?(key)

    VARS.fetch(key).call(@context).tap do |result|
      @cache[key] = result
    end
  end

  def to_hash
    return @cache if @all_evaluated

    VARS.each { |key, _| self[key] }
    @all_evaluated = true

    @cache
  end
end

Rather than taking a fully evaluated list of variables as input, ExpandVariables could take a variables builder that lazily evaluates only the variables it needs. In fact, we could expose a method on Variables::Builder to expand variables and use the ExpandVariables logic internally:

builder = Gitlab::Ci::Variables::Builder.new(pipeline)
builder.expand_variable(text)

class Gitlab::Ci::Variables::Builder
  def expand_variable(text)
    ExpandVariables.expand(text, context)
  end

  private

  # Built once per builder and reused for every expansion.
  def context
    Context.new(pipeline)
  end
  strong_memoize_attr :context
end
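One way to picture the lazy expansion described above is an expander that resolves only the keys it actually finds in the text. The sketch below is an illustration under assumptions, not GitLab's ExpandVariables: `LazyExpand`, its `PATTERN`, and the Hash-based `lazy_variables` are hypothetical names, and only the `$VAR` and `${VAR}` forms are handled.

```ruby
# Hypothetical expander: resolves only the variables referenced in `text`.
# `variables` is anything that responds to #[] (for example a lazy builder).
module LazyExpand
  # Matches $VAR and ${VAR}; other syntaxes are ignored in this sketch.
  PATTERN = /\$(?:\{(?<braced>[A-Za-z_][A-Za-z0-9_]*)\}|(?<plain>[A-Za-z_][A-Za-z0-9_]*))/

  def self.expand(text, variables)
    text.gsub(PATTERN) do
      match = Regexp.last_match
      key = match[:braced] || match[:plain]
      variables[key].to_s # unknown variables expand to an empty string here
    end
  end
end

# Variable definitions are lambdas, so nothing is computed until requested.
definitions = {
  'CI_COMMIT_BRANCH' => -> { 'main' },
  'CI_PROJECT_PATH'  => -> { 'group/project' }
}

# A Hash with a default block acts as a memoizing lazy lookup:
# the block runs only the first time a key is requested.
evaluated = []
lazy_variables = Hash.new do |cache, key|
  evaluated << key
  cache[key] = definitions[key]&.call
end

result = LazyExpand.expand('branch=$CI_COMMIT_BRANCH and ${CI_COMMIT_BRANCH}', lazy_variables)

puts result
puts evaluated.inspect
```

In this run only `CI_COMMIT_BRANCH` is evaluated, and only once, even though it appears twice in the text; `CI_PROJECT_PATH` is never computed.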

Dependencies

This issue likely depends on Backend: Change the priority of all predefined ... (#388961 - closed). Separating GitLab-defined variables from user-defined variables, so that user-defined variables cannot override GitLab-defined ones, would simplify the design and clarify the differences:

  • GitLab-defined variables must not contain overlapping keys. This is why representing them as a hash would be a better and more explicit design choice.
  • user-defined variables can override each other depending on variable precedence.
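The two rules above can be sketched as a resolver that checks a single hash of GitLab-defined variables first and then walks the user-defined sources in precedence order. This is a hypothetical illustration: `VariableResolver` is not an existing class, the source ordering is only an example, and the sketch assumes the separation proposed here, where user-defined variables cannot override GitLab-defined ones.

```ruby
# Hypothetical resolver illustrating the two kinds of variables.
class VariableResolver
  # predefined:   Hash of key => value; keys are unique by construction.
  # user_sources: Array of Hashes ordered from highest to lowest precedence.
  def initialize(predefined, user_sources)
    @predefined = predefined
    @user_sources = user_sources
  end

  def [](key)
    # Assumption: GitLab-defined variables cannot be overridden by users.
    return @predefined[key] if @predefined.key?(key)

    # The first source that defines the key wins; later ones are not consulted.
    source = @user_sources.find { |vars| vars.key?(key) }
    source && source[key]
  end
end

predefined    = { 'CI_PIPELINE_SOURCE' => 'push' }
pipeline_vars = { 'DEPLOY_ENV' => 'staging', 'CI_PIPELINE_SOURCE' => 'spoofed' }
project_vars  = { 'DEPLOY_ENV' => 'production', 'REGION' => 'eu' }

# Illustrative ordering: pipeline variables take precedence over project variables.
resolver = VariableResolver.new(predefined, [pipeline_vars, project_vars])

puts resolver['CI_PIPELINE_SOURCE']
puts resolver['DEPLOY_ENV']
puts resolver['REGION']
```

Because the GitLab-defined hash is consulted first, a lookup for a predefined key never touches the user-defined sources, which is what makes the fast path possible.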

Challenges

Today, predefined variables can be overridden by user-defined variables (project/group secrets, pipeline variables, trigger variables, schedule variables, etc.). This is a strange design choice: we can't guarantee that certain variables are really statically defined, and users can't rely on their values either.