[Backend] Spike: Pipeline creation flow and policy job injection
## Overview
This spike documents how pipelines are created and how policy jobs are injected into them. Understanding this flow is essential for implementing Security Scan Profiles-based job injection.
## Pipeline Creation Entry Point
Pipeline creation flows use [`Ci::CreatePipelineService`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/services/ci/create_pipeline_service.rb). The service accepts a [`source` parameter](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/models/concerns/enums/ci/pipeline.rb#L36-57) that identifies how the pipeline was triggered (`:web`, `:push`, `:merge_request_event`, etc.).
The service executes a chain of steps defined in `SEQUENCE`. The key steps relevant to policy job injection are:
```ruby
SEQUENCE = [
  # ... early validation steps ...
  Gitlab::Ci::Pipeline::Chain::PipelineExecutionPolicies::EvaluatePolicies, # PEP evaluation (early)
  # ... skip and config validation ...
  Gitlab::Ci::Pipeline::Chain::Config::Content, # Config resolution
  Gitlab::Ci::Pipeline::Chain::Config::Process, # YAML processing (SEP injection)
  # ... workflow rules, seeding, limits ...
  Gitlab::Ci::Pipeline::Chain::Populate,
  Gitlab::Ci::Pipeline::Chain::PopulateMetadata,
  Gitlab::Ci::Pipeline::Chain::PipelineExecutionPolicies::ApplyPolicies, # PEP jobs merged (late)
  # ... final creation and metrics ...
]
```
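To illustrate why step order matters, here is a minimal sketch of how a chain of steps could be executed. All class and method names below (`SketchStep`, `run_chain`, and the `*Sketch` steps) are hypothetical stand-ins, and the `perform!` / `break?` contract is an assumption about how steps signal failure, not a copy of the real chain runner.

```ruby
# Hypothetical stand-in for a chain step; not the real GitLab base class.
class SketchStep
  def initialize(log)
    @log = log
    @failed = false
  end

  # Records that the step ran.
  def perform!
    @log << self.class
  end

  # Assumption: a step that returns true here stops the rest of the chain.
  def break?
    @failed
  end
end

class EvaluatePoliciesSketch < SketchStep; end

class ConfigContentSketch < SketchStep
  def perform!
    super
    @failed = true # simulate a config error in this step
  end
end

class ApplyPoliciesSketch < SketchStep; end

# Runs each step in order and stops at the first step that asks to break.
def run_chain(sequence, log)
  sequence.each do |step_class|
    step = step_class.new(log)
    step.perform!
    break if step.break?
  end
  log
end

executed = run_chain(
  [EvaluatePoliciesSketch, ConfigContentSketch, ApplyPoliciesSketch],
  []
)
```

In this sketch the late step never runs once an earlier step fails, which is why a step that must always take effect has to sit early in the sequence.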
## Step 1: Configuration Resolution
The [`Gitlab::Ci::Pipeline::Chain::Config::Content`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/ci/pipeline/chain/config/content.rb) step resolves the CI configuration content. It instantiates [`Gitlab::Ci::ProjectConfig`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/ci/project_config.rb), a coordinator that tries multiple sources in priority order until it finds one with content.
### Source Priority
The [`ProjectConfig`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/ci/project_config.rb) class iterates through `STANDARD_SOURCES` in order, returning the first source that has content:
1. **[Compliance](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/ci/project_config/compliance.rb)** - Compliance framework pipelines (highest priority, EE-only logic)
2. **[Parameter](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/ci/project_config/parameter.rb)** - Custom content passed as parameter (used by on-demand scans)
3. **[Bridge](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/ci/project_config/bridge.rb)** - Downstream pipeline config from bridge job
4. **[ProjectSetting](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/ci/project_config/project_setting.rb)** - Project's CI config (local `.gitlab-ci.yml`, remote, or external project)
5. **[AutoDevops](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/ci/project_config/auto_devops.rb)** - Auto DevOps template if enabled
If no standard source provides content, the fallback `SecurityPolicyDefault` source is used. See Step 2.
### Source Base Class
Each source inherits from [`Gitlab::Ci::ProjectConfig::Source`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/ci/project_config/source.rb) and implements `content` (returns YAML or `nil`), `exists?` (returns `true` if content is present), and `source` (returns a symbol identifying the source type).
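The priority lookup and the source interface can be sketched together as follows. `SketchSource` and `first_available_source` are hypothetical names, the source symbols are illustrative rather than the real enum values, and treating blank content as "no content" is an assumption.

```ruby
# Hypothetical source: wraps optional YAML content and a symbol identifying it.
class SketchSource
  attr_reader :content, :source

  def initialize(source, content)
    @source = source
    @content = content
  end

  # Assumption: a source "exists" when its content is non-nil and non-blank.
  def exists?
    !content.nil? && !content.strip.empty?
  end
end

# Returns the first standard source with content, otherwise the fallback
# (only if the fallback itself has content), otherwise nil.
def first_available_source(standard_sources, fallback)
  standard_sources.find(&:exists?) || (fallback if fallback.exists?)
end

standard_sources = [
  SketchSource.new(:compliance, nil),
  SketchSource.new(:parameter, nil),
  SketchSource.new(:bridge, nil),
  SketchSource.new(:project_setting, "test:\n  script: echo ok\n"),
  SketchSource.new(:auto_devops, "build:\n  script: echo auto\n")
]
fallback = SketchSource.new(:security_policy_default, nil)

chosen = first_available_source(standard_sources, fallback)
nothing = first_available_source([SketchSource.new(:bridge, nil)], fallback)
```

Here `chosen` is the project setting source, because it is the first entry with content even though a lower-priority source also has some.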
## Step 2: Fallback for Policy-Only Pipelines
When no standard source provides configuration, [`Gitlab::Ci::ProjectConfig::SecurityPolicyDefault`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/lib/ee/gitlab/ci/project_config/security_policy_default.rb) enables pipeline creation if security policies are defined. This is how pipelines can run with only policy jobs when there's no `.gitlab-ci.yml`.
The [EE implementation](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/lib/ee/gitlab/ci/project_config/security_policy_default.rb#L21-31) returns different content based on which policies are applicable:
```ruby
DUMMY_CONTENT = {
  'Pipeline execution policy trigger' => {
    'stage' => ::Gitlab::Ci::Config::Stages::EDGE_PRE,
    'script' => ['echo "Forcing project pipeline to run policy jobs."']
  }
}.freeze

def content
  if has_applicable_scan_execution_policies_defined?
    YAML.dump(nil) # Empty config - just enough to pass validation
  elsif has_pipeline_execution_policies_defined?
    YAML.dump(DUMMY_CONTENT) # Placeholder job, removed later
  end
end
```
For Scan Execution Policies, it returns `YAML.dump(nil)` (empty YAML) which is sufficient to pass validation and allow SEP jobs to be merged during config processing. For Pipeline Execution Policies, it returns `DUMMY_CONTENT` with a placeholder job that gets removed later in the `ApplyPolicies` step when actual policy jobs are merged.
The `has_applicable_scan_execution_policies_defined?` method checks whether there are active SEPs applicable to the current branch by using [`Security::SecurityOrchestrationPolicies::PolicyBranchesService`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/app/services/security/security_orchestration_policies/policy_branches_service.rb).
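The branching above can be exercised in isolation with a small sketch. `fallback_content`, its keyword arguments, and `SKETCH_PLACEHOLDER` are hypothetical, and the `'.pre'` stage name is an assumed value standing in for `EDGE_PRE`.

```ruby
require 'yaml'

# Hypothetical placeholder job; '.pre' is an assumed stand-in for EDGE_PRE.
SKETCH_PLACEHOLDER = {
  'Pipeline execution policy trigger' => {
    'stage' => '.pre',
    'script' => ['echo "Forcing project pipeline to run policy jobs."']
  }
}.freeze

# Mirrors the if/elsif order: SEPs are checked first, PEPs second.
def fallback_content(sep_applicable:, pep_defined:)
  if sep_applicable
    YAML.dump(nil)
  elsif pep_defined
    YAML.dump(SKETCH_PLACEHOLDER)
  end
end

both = YAML.safe_load(fallback_content(sep_applicable: true, pep_defined: true))
pep_only = YAML.safe_load(fallback_content(sep_applicable: false, pep_defined: true))
neither = fallback_content(sep_applicable: false, pep_defined: false)
```

Two consequences of the ordering show up here: when both policy types apply, the empty config wins and no placeholder job is emitted, and when neither applies the method returns `nil`, so the fallback source has no content.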
## Step 3: YAML Processing and SEP Job Injection
The [`Gitlab::Ci::Pipeline::Chain::Config::Process`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/ci/pipeline/chain/config/process.rb) step parses the YAML content using [`Gitlab::Ci::YamlProcessor`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/ci/yaml_processor.rb), which instantiates [`Gitlab::Ci::Config`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/ci/config.rb) to perform the actual YAML parsing and expansion.
### For EE
The [`EE::Gitlab::Ci::ConfigEE`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/lib/ee/gitlab/ci/config_ee.rb#L19-25) module overrides `build_config` to inject policy-related processing into the config expansion pipeline:
```ruby
override :build_config
def build_config(config, inputs)
  super
    .then { |config| process_required_includes(config) }
    .then { |config| enforce_pipeline_execution_policy_stages(config) }
    .then { |config| process_security_orchestration_policy_includes(config) }
end
```
The `process_security_orchestration_policy_includes` method calls the SEP processor, but skips injection for PEP pipelines (policy jobs are added only to the main pipeline):
```ruby
def process_security_orchestration_policy_includes(config)
  return config if pipeline_policy_context&.pipeline_execution_context&.creating_policy_pipeline?

  ::Gitlab::Ci::Config::SecurityOrchestrationPolicies::Processor.new(
    config, context, source_ref_path, pipeline_policy_context
  ).perform
end
```
### SEP Processor
The [`Gitlab::Ci::Config::SecurityOrchestrationPolicies::Processor`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/lib/gitlab/ci/config/security_orchestration_policies/processor.rb) injects security scan jobs from Scan Execution Policies into the CI configuration by merging job definitions into the config hash.
```ruby
def perform
  return @config unless scan_execution_policy_context&.has_scan_execution_policies?
  return @config if @config[:stages].present? && !Entry::Stages.new(@config[:stages]).valid?

  @config[:workflow] = { rules: [{ when: 'always' }] } if @config.empty?

  merged_config = @config.deep_merge(merged_security_policy_config)
  merged_config[:stages] = cleanup_stages(merged_config[:stages])
  merged_config
end
```
The processor handles stage management differently based on scan type. For example, pipeline scans (SAST, Secret Detection, Dependency Scanning, etc.) are added to the `test` stage if it exists; otherwise a `scan-policies` stage is created after the `build` stage.
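The stage selection rule for pipeline scans can be sketched as a small pure function. `stage_for_pipeline_scan` is a hypothetical helper, and appending the new stage at the end when there is no `build` stage is an assumption that the description above does not cover.

```ruby
# Returns [stages, stage_name]: the (possibly extended) stage list and the
# stage a pipeline scan job would be placed in.
def stage_for_pipeline_scan(stages)
  return [stages, 'test'] if stages.include?('test')

  build_index = stages.index('build')
  # Assumption: with no 'build' stage, the new stage goes at the end.
  insert_at = build_index ? build_index + 1 : stages.length
  [stages.dup.insert(insert_at, 'scan-policies'), 'scan-policies']
end

with_test = stage_for_pipeline_scan(%w[build test deploy])
without_test = stage_for_pipeline_scan(%w[build deploy])
without_build = stage_for_pipeline_scan(%w[deploy])
```

With a `test` stage present the stage list is returned unchanged; without it, `scan-policies` is inserted directly after `build`.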
### Scan Pipeline Service
The [`Security::SecurityOrchestrationPolicies::ScanPipelineService`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/app/services/security/security_orchestration_policies/scan_pipeline_service.rb) generates the actual job configurations. It receives a list of scan actions from the policy and returns job definitions partitioned into `pipeline_scan` and `on_demand` categories, along with variables to inject.
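A rough sketch of the partitioning idea follows. Everything here is hypothetical: the helper names, the shape of a scan action (`{ scan: '...' }`), the generated job names and scripts, and in particular the assumption that `dast` is the only on-demand scan type. Variable injection is left out.

```ruby
# Assumption: 'dast' is the only on-demand scan type in this sketch.
SKETCH_ON_DEMAND_SCAN_TYPES = %w[dast].freeze

# Builds placeholder job definitions keyed by a made-up job name.
def sketch_job_definitions(actions)
  actions.each_with_index.map do |action, index|
    ["#{action[:scan]}-#{index}", { 'script' => ["echo run #{action[:scan]}"] }]
  end.to_h
end

# Splits scan actions into the two categories and builds jobs for each.
def partition_scan_actions(actions)
  on_demand, pipeline_scan = actions.partition do |action|
    SKETCH_ON_DEMAND_SCAN_TYPES.include?(action[:scan])
  end

  {
    pipeline_scan: sketch_job_definitions(pipeline_scan),
    on_demand: sketch_job_definitions(on_demand)
  }
end

partitioned = partition_scan_actions(
  [{ scan: 'sast' }, { scan: 'dast' }, { scan: 'secret_detection' }]
)
```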
## Step 4: Pipeline Execution Policy (PEP) Evaluation
The [`Gitlab::Ci::Pipeline::Chain::PipelineExecutionPolicies::EvaluatePolicies`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/lib/ee/gitlab/ci/pipeline/chain/pipeline_execution_policies/evaluate_policies.rb) step runs early in the chain, before `Config::Content`. It evaluates applicable Pipeline Execution Policies, builds isolated pipeline objects for each policy, and stores them in `command.pipeline_policy_context`.
```ruby
def perform!
  command
    .pipeline_policy_context
    .pipeline_execution_context
    .build_policy_pipelines!(pipeline.partition_id) do |error_message|
      break error("Pipeline execution policy error: #{error_message}", failure_reason: :config_error)
    end
end
```
This early positioning is intentional: it must run before `Config::Content` to force pipeline creation when no `.gitlab-ci.yml` exists, and before `Skip` to enforce pipeline creation regardless of `ci.skip` options.
## Step 5: PEP Job Merging
The [`Gitlab::Ci::Pipeline::Chain::PipelineExecutionPolicies::ApplyPolicies`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/lib/ee/gitlab/ci/pipeline/chain/pipeline_execution_policies/apply_policies.rb) step runs late in the chain, after `Populate` and `PopulateMetadata`, to merge PEP jobs into the pipeline stages.
```ruby
def perform!
  policy_context = command.pipeline_policy_context.pipeline_execution_context

  if policy_context.creating_policy_pipeline?
    collect_policy_pipeline_stages
  elsif policy_context.has_execution_policy_pipelines?
    clear_project_pipeline # Remove dummy job if pipeline was forced
    merge_policy_jobs # Inject policy jobs into stages
    usage_tracking.track_enforcement
  end
end
```
The `clear_project_pipeline` method removes the dummy job when the pipeline was forced by a policy with no other config. The `merge_policy_jobs` method iterates through policy pipelines and uses [`Gitlab::Ci::Pipeline::JobsInjector`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/lib/gitlab/ci/pipeline/jobs_injector.rb) to inject jobs into the appropriate stages.
### Jobs Injector
The `JobsInjector` handles job injection with conflict resolution. It only injects into stages declared in the project config, handles job name conflicts via a callback (typically adding a suffix), updates `needs` references when jobs are renamed, and raises `DuplicateJobNameError` if a conflict can't be resolved.
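The behaviour described above can be sketched with plain hashes. This is not the real `JobsInjector` API: `inject_policy_jobs`, `DuplicateJobNameSketchError`, the job hash shape (`name`, `stage`, `needs`), and the `:policy` suffix are all hypothetical, and the sketch assumes that `needs` rewriting applies to the injected policy jobs.

```ruby
class DuplicateJobNameSketchError < StandardError; end

# project_stages: { 'stage' => [job, ...] }; policy_jobs: [job, ...]
# The block is the conflict callback: it receives a taken name and returns
# a replacement name.
def inject_policy_jobs(project_stages, policy_jobs)
  taken = project_stages.values.flatten.map { |job| job[:name] }
  renames = {}

  # Only jobs whose stage is declared in the project config are injected.
  injectable = policy_jobs.select { |job| project_stages.key?(job[:stage]) }

  final_names = injectable.map do |job|
    name = job[:name]
    if taken.include?(name)
      name = yield(name)
      raise DuplicateJobNameSketchError, "cannot resolve #{job[:name]}" if taken.include?(name)
      renames[job[:name]] = name
    end
    taken << name
    name
  end

  result = project_stages.transform_values(&:dup)
  injectable.zip(final_names).each do |job, name|
    # Point `needs` at the new names of any renamed jobs.
    needs = job.fetch(:needs, []).map { |need| renames.fetch(need, need) }
    result[job[:stage]] << job.merge(name: name, needs: needs)
  end
  result
end

project_stages = {
  'build' => [{ name: 'compile', needs: [] }],
  'test' => [{ name: 'sast', needs: [] }]
}
policy_jobs = [
  { name: 'sast', stage: 'test', needs: [] },
  { name: 'report', stage: 'test', needs: ['sast'] },
  { name: 'cleanup', stage: 'policy-only', needs: [] }
]

injected = inject_policy_jobs(project_stages, policy_jobs) { |name| "#{name}:policy" }
```

In this run the policy `sast` job is renamed, `report` follows the rename through its `needs`, and `cleanup` is dropped because its stage is not declared in the project config.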