Expand CI variables lazily and selectively
Problem
In GitLab the only time we need to build the full set of CI variables is when sending them to the runner as we assign a job to execute. In all other scenarios we use variables to expand strings potentially containing other variables.
To expand even a single variable in a string we build a lot of variables.
job:
  script: echo
  rules:
    - if: $CI_COMMIT_BRANCH
To expand $CI_COMMIT_BRANCH we first build all variables that can potentially be used in that given context.
Can we lazily build only CI_COMMIT_BRANCH and memoize its value?
Idea
A general idea would be to have:
- a hash of GitLab-defined variables (mostly our predefined variables) for fast access.
- a lazy evaluation of the variables persisted as AR models (e.g. project variables, group variables, etc.). For example: a lookup by key.
The hash for GitLab-defined variables could be something like:
class Ci::Variables::Builder::Static # variables defined statically by GitLab
  VARS = {
    'CI_PROJECT_PATH' => ->(context) { context.project.full_path },
    'CI_PROJECT_DESCRIPTION' => ->(context) { context.project.description },
    'CI_PIPELINE_SOURCE' => ->(context) { context.pipeline.source.to_s },
    # ...
  }.freeze

  def initialize(context)
    @context = context
    @cache = {} # cache already computed values.
    @all_evaluated = false # whether all values have been evaluated and cached.
  end

  # Evaluates a single variable on first access and caches the result.
  def [](key)
    return @cache[key] if @cache.key?(key)

    VARS.fetch(key).call(@context).tap do |result|
      @cache[key] = result
    end
  end

  # Evaluates every variable not yet cached and returns the full hash.
  def to_hash
    return @cache if @all_evaluated

    VARS.each_key { |key| self[key] }
    @all_evaluated = true
    @cache
  end
end
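The second half of the idea, a lazy lookup by key for persisted variables, could follow the same caching pattern. The sketch below is only an illustration: `FakeVariableStore` and `LazyPersistedVariables` are hypothetical names, and the in-memory store stands in for an ActiveRecord relation so that the number of lookups is observable.

```ruby
# Hypothetical sketch: lazily resolve persisted (user-defined) variables by key.
# FakeVariableStore stands in for an ActiveRecord relation such as project
# variables; it counts lookups so the laziness can be observed.
class FakeVariableStore
  attr_reader :query_count

  def initialize(rows)
    @rows = rows # key => value, e.g. { 'DEPLOY_ENV' => 'staging' }
    @query_count = 0
  end

  # Stand-in for a single-row database lookup by key.
  def find_value(key)
    @query_count += 1
    @rows[key]
  end
end

class LazyPersistedVariables
  def initialize(store)
    @store = store
    @cache = {}
  end

  # Looks a key up once and memoizes the result, including misses (nil).
  def [](key)
    return @cache[key] if @cache.key?(key)

    @cache[key] = @store.find_value(key)
  end
end

store = FakeVariableStore.new({ 'DEPLOY_ENV' => 'staging', 'API_URL' => 'https://example.test' })
variables = LazyPersistedVariables.new(store)

variables['DEPLOY_ENV'] # first access hits the store
variables['DEPLOY_ENV'] # second access is served from the cache
variables['MISSING']    # the miss is cached as well
variables['MISSING']
```

Caching misses matters here because an expression may reference an undefined variable many times, and each reference would otherwise trigger another lookup.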
Rather than taking a fully evaluated list of variables as input, ExpandVariables could take a variables builder that lazily evaluates only the variables needed. In fact, we could expose a method in Variables::Builder to expand variables and use the ExpandVariables logic internally:
builder = Ci::Gitlab::Ci::Variables::Builder.new(pipeline)
builder.expand_variable(text)

class Ci::Gitlab::Ci::Variables::Builder
  def expand_variable(text)
    ExpandVariables.expand(text, context)
  end

  private

  def context
    Context.new(pipeline)
  end
  strong_memoize_attr :context
end
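To make the lazy expansion concrete, here is a minimal sketch of expanding a string against an object that only responds to `[]`. It is not GitLab's ExpandVariables: `LazyExpand`, `RecordingVariables`, and the reference pattern are assumptions made for illustration, and the real expansion rules may differ.

```ruby
# Hypothetical sketch, not GitLab's ExpandVariables: expand $VAR and ${VAR}
# references by asking a lookup object only for the keys found in the text.
module LazyExpand
  # Matches $NAME or ${NAME}; this pattern is an assumption.
  REFERENCE = /\$(?:(?<bare>[A-Za-z_][A-Za-z0-9_]*)|\{(?<braced>[A-Za-z_][A-Za-z0-9_]*)\})/

  # `variables` is anything responding to #[] (for example a lazy builder).
  def self.expand(text, variables)
    text.gsub(REFERENCE) do
      match = Regexp.last_match
      key = match[:bare] || match[:braced]
      variables[key].to_s # unknown variables expand to an empty string here
    end
  end
end

# A tiny lazy source that records which keys were actually evaluated.
class RecordingVariables
  attr_reader :evaluated

  def initialize(definitions)
    @definitions = definitions # key => lambda producing the value
    @cache = {}
    @evaluated = []
  end

  def [](key)
    return @cache[key] if @cache.key?(key)

    @evaluated << key
    @cache[key] = @definitions[key]&.call
  end
end

variables = RecordingVariables.new({
  'CI_COMMIT_BRANCH' => -> { 'main' },
  'CI_PROJECT_PATH' => -> { 'group/project' },
  'CI_PIPELINE_SOURCE' => -> { 'push' }
})

expanded = LazyExpand.expand('deploy-$CI_COMMIT_BRANCH-${CI_COMMIT_BRANCH}', variables)
```

After the call, `variables.evaluated` contains only `CI_COMMIT_BRANCH`: the other two definitions are never invoked, which is the behavior the proposal is after.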
Dependencies
This issue likely depends on Backend: Change the priority of all predefined ... (#388961 - closed). Separating GitLab-defined variables from user-defined variables, so that user-defined variables cannot override GitLab-defined ones, would simplify the design and clarify the differences:
- GitLab-defined variables must not contain overlapping keys. This is why representing them as a hash would be a better and more explicit design choice.
- user-defined variables can override each other depending on the variables precedence.
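Under that separation, a single-key lookup could check the GitLab-defined hash first and only then walk the user-defined sources in precedence order. The sketch below assumes the priority change has landed; `LayeredLookup` and the ordering of `user_sources` (highest precedence first) are hypothetical.

```ruby
# Hypothetical sketch of a lookup where GitLab-defined variables always win and
# user-defined sources are consulted in precedence order.
class LayeredLookup
  # `gitlab_defined` is a hash with unique keys; `user_sources` is an array of
  # hashes ordered from highest to lowest precedence (an assumption).
  def initialize(gitlab_defined:, user_sources:)
    @gitlab_defined = gitlab_defined
    @user_sources = user_sources
  end

  def [](key)
    return @gitlab_defined[key] if @gitlab_defined.key?(key)

    # The first source that defines the key wins; later sources are ignored.
    source = @user_sources.find { |vars| vars.key?(key) }
    source && source[key]
  end
end

lookup = LayeredLookup.new(
  gitlab_defined: { 'CI_PIPELINE_SOURCE' => 'push' },
  user_sources: [
    { 'DEPLOY_ENV' => 'production', 'CI_PIPELINE_SOURCE' => 'spoofed' }, # e.g. pipeline variables
    { 'DEPLOY_ENV' => 'staging', 'REGION' => 'eu' }                      # e.g. project variables
  ]
)

lookup['CI_PIPELINE_SOURCE'] # GitLab-defined value, the user override is ignored
lookup['DEPLOY_ENV']         # taken from the highest-precedence user source
```

Because the GitLab-defined keys never overlap, that first check is a plain hash access and needs no precedence logic at all.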
Challenges
Today predefined variables can be overridden by user-defined variables (project/group secrets, pipeline variables, trigger variables, schedule variables, etc.). It's a strange design choice: we can't guarantee that certain variables are really statically defined, and users cannot rely on their values either.