# [Backend] Spike: Pipeline creation flow and policy job injection (#586877)

## Overview

This spike documents how pipelines are created and how policy jobs are injected into them. Understanding this flow is essential for implementing Security Scan Profiles-based job injection.

## Pipeline Creation Entry Point

Pipeline creation flows use [`Ci::CreatePipelineService`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/services/ci/create_pipeline_service.rb). The service accepts a [`source` parameter](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/models/concerns/enums/ci/pipeline.rb#L36-57) that identifies how the pipeline was triggered (`:web`, `:push`, `:merge_request_event`, etc.).

The service executes a chain of steps defined in `SEQUENCE`. The key steps relevant to policy job injection are:

```ruby
SEQUENCE = [
  # ... early validation steps ...
  Gitlab::Ci::Pipeline::Chain::PipelineExecutionPolicies::EvaluatePolicies, # PEP evaluation (early)
  # ... skip and config validation ...
  Gitlab::Ci::Pipeline::Chain::Config::Content, # Config resolution
  Gitlab::Ci::Pipeline::Chain::Config::Process, # YAML processing (SEP injection)
  # ... workflow rules, seeding, limits ...
  Gitlab::Ci::Pipeline::Chain::Populate,
  Gitlab::Ci::Pipeline::Chain::PopulateMetadata,
  Gitlab::Ci::Pipeline::Chain::PipelineExecutionPolicies::ApplyPolicies, # PEP jobs merged (late)
  # ... final creation and metrics ...
]
```

## Step 1: Configuration Resolution

The [`Gitlab::Ci::Pipeline::Chain::Config::Content`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/ci/pipeline/chain/config/content.rb) step resolves the CI configuration content. It instantiates [`Gitlab::Ci::ProjectConfig`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/ci/project_config.rb), which acts as a coordinator that tries multiple sources in priority order to find a valid configuration.

### Source Priority

The [`ProjectConfig`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/ci/project_config.rb) class iterates through `STANDARD_SOURCES` in order, returning the first source that has content:

1. **[Compliance](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/ci/project_config/compliance.rb)** - Compliance framework pipelines (highest priority, EE-only logic)
2. **[Parameter](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/ci/project_config/parameter.rb)** - Custom content passed as parameter (used by on-demand scans)
3. **[Bridge](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/ci/project_config/bridge.rb)** - Downstream pipeline config from bridge job
4. **[ProjectSetting](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/ci/project_config/project_setting.rb)** - Project's CI config (local `.gitlab-ci.yml`, remote, or external project)
5. **[AutoDevops](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/ci/project_config/auto_devops.rb)** - Auto DevOps template if enabled

If no standard source provides content, the fallback `SecurityPolicyDefault` is used (see Step 2).

### Source Base Class

Each source inherits from [`Gitlab::Ci::ProjectConfig::Source`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/ci/project_config/source.rb) and implements `content` (returns YAML or `nil`), `exists?` (returns `true` if content is present), and `source` (returns a symbol identifying the source type).

## Step 2: Fallback for Policy-Only Pipelines

When no standard source provides configuration, [`Gitlab::Ci::ProjectConfig::SecurityPolicyDefault`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/lib/ee/gitlab/ci/project_config/security_policy_default.rb) enables pipeline creation if security policies are defined. This is how pipelines can run with only policy jobs when there's no `.gitlab-ci.yml`.
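
The first-match lookup and the policy fallback described above can be sketched in a few lines. This is a simplified illustration, not the real `Gitlab::Ci::ProjectConfig` code: the class names `SketchSource`, `ProjectSettingSketch`, `AutoDevopsSketch`, and `PolicyFallbackSketch`, the `resolve_config` helper, and the returned symbols are all hypothetical stand-ins.

```ruby
# Hypothetical base class mirroring the three methods each source implements.
class SketchSource
  def initialize(yaml = nil)
    @yaml = yaml
  end

  # Returns the YAML string, or nil when this source has nothing to offer.
  def content
    @yaml
  end

  # True only when content is present.
  def exists?
    !content.nil? && !content.empty?
  end

  # Symbol identifying the source type; subclasses must override.
  def source
    raise NotImplementedError
  end
end

class ProjectSettingSketch < SketchSource
  def source
    :project_setting # illustrative symbol, not the real one
  end
end

class AutoDevopsSketch < SketchSource
  def source
    :auto_devops # illustrative symbol, not the real one
  end
end

class PolicyFallbackSketch < SketchSource
  def source
    :security_policy_default # illustrative symbol, not the real one
  end
end

# Returns the first standard source with content, otherwise the fallback
# (if it has content), otherwise nil.
def resolve_config(standard_sources, fallback)
  standard_sources.find(&:exists?) || (fallback if fallback.exists?)
end

standard = [ProjectSettingSketch.new(nil), AutoDevopsSketch.new("build:\n  script: make\n")]
fallback = PolicyFallbackSketch.new(YAML.dump(nil)) if defined?(YAML)
fallback ||= PolicyFallbackSketch.new("--- \n")

puts resolve_config(standard, fallback).source
```

Because the fallback is only consulted after every standard source has been tried, a project with a `.gitlab-ci.yml` never reaches it in this sketch.
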

The [EE implementation](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/lib/ee/gitlab/ci/project_config/security_policy_default.rb#L21-31) returns different content based on which policies are applicable:

```ruby
DUMMY_CONTENT = {
  'Pipeline execution policy trigger' => {
    'stage' => ::Gitlab::Ci::Config::Stages::EDGE_PRE,
    'script' => ['echo "Forcing project pipeline to run policy jobs."']
  }
}.freeze

def content
  if has_applicable_scan_execution_policies_defined?
    YAML.dump(nil) # Empty config - just enough to pass validation
  elsif has_pipeline_execution_policies_defined?
    YAML.dump(DUMMY_CONTENT) # Placeholder job, removed later
  end
end
```

For Scan Execution Policies (SEPs), it returns `YAML.dump(nil)` (empty YAML), which is sufficient to pass validation and allow SEP jobs to be merged during config processing. For Pipeline Execution Policies (PEPs), it returns `DUMMY_CONTENT` with a placeholder job that gets removed later in the `ApplyPolicies` step when actual policy jobs are merged.

The `has_applicable_scan_execution_policies_defined?` method checks whether there are active SEPs applicable to the current branch by using [`Security::SecurityOrchestrationPolicies::PolicyBranchesService`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/app/services/security/security_orchestration_policies/policy_branches_service.rb).

## Step 3: YAML Processing and SEP Job Injection

The [`Gitlab::Ci::Pipeline::Chain::Config::Process`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/ci/pipeline/chain/config/process.rb) step parses the YAML content using [`Gitlab::Ci::YamlProcessor`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/ci/yaml_processor.rb), which instantiates [`Gitlab::Ci::Config`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/ci/config.rb) to perform the actual YAML parsing and expansion.
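
To make the two fallback payloads concrete, the following standalone sketch shows what a plain YAML parser gets back from each of them. It uses Ruby's standard `yaml` library directly rather than `Gitlab::Ci::YamlProcessor`, and the `EDGE_PRE` value of `'.pre'` is an assumption standing in for `::Gitlab::Ci::Config::Stages::EDGE_PRE`.

```ruby
require 'yaml'

# Assumption: stands in for ::Gitlab::Ci::Config::Stages::EDGE_PRE.
EDGE_PRE = '.pre'

DUMMY_CONTENT = {
  'Pipeline execution policy trigger' => {
    'stage' => EDGE_PRE,
    'script' => ['echo "Forcing project pipeline to run policy jobs."']
  }
}.freeze

sep_payload = YAML.dump(nil)           # what the fallback returns for SEPs
pep_payload = YAML.dump(DUMMY_CONTENT) # what the fallback returns for PEPs

# The SEP payload is valid YAML that loads back as nil (no jobs at all),
# while the PEP payload loads back as a hash with one placeholder job.
sep_config = YAML.safe_load(sep_payload)
pep_config = YAML.safe_load(pep_payload)

puts sep_config.inspect
puts pep_config.keys.inspect
```

In both cases the content is non-empty as a string, which is what lets the fallback source report that it has content even though the SEP payload carries no jobs.
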

### For EE

The [`EE::Gitlab::Ci::ConfigEE`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/lib/ee/gitlab/ci/config_ee.rb#L19-25) module overrides `build_config` to inject policy-related processing into the config expansion pipeline:

```ruby
override :build_config
def build_config(config, inputs)
  super
    .then { |config| process_required_includes(config) }
    .then { |config| enforce_pipeline_execution_policy_stages(config) }
    .then { |config| process_security_orchestration_policy_includes(config) }
end
```

The `process_security_orchestration_policy_includes` method calls the SEP processor, but skips injection when a PEP policy pipeline is being created (policy jobs are added only to the main pipeline):

```ruby
def process_security_orchestration_policy_includes(config)
  return config if pipeline_policy_context&.pipeline_execution_context&.creating_policy_pipeline?

  ::Gitlab::Ci::Config::SecurityOrchestrationPolicies::Processor.new(
    config, context, source_ref_path, pipeline_policy_context
  ).perform
end
```

### SEP Processor

The [`Gitlab::Ci::Config::SecurityOrchestrationPolicies::Processor`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/lib/gitlab/ci/config/security_orchestration_policies/processor.rb) injects security scan jobs from Scan Execution Policies into the CI configuration by merging job definitions into the config hash.

```ruby
def perform
  return @config unless scan_execution_policy_context&.has_scan_execution_policies?
  return @config if @config[:stages].present? && !Entry::Stages.new(@config[:stages]).valid?

  @config[:workflow] = { rules: [{ when: 'always' }] } if @config.empty?

  merged_config = @config.deep_merge(merged_security_policy_config)
  merged_config[:stages] = cleanup_stages(merged_config[:stages])
  merged_config
end
```

The processor handles stage management differently based on scan type.
For example, pipeline scans (SAST, Secret Detection, Dependency Scanning, etc.) are added to the `test` stage if it exists; otherwise a `scan-policies` stage is created after the `build` stage.

### Scan Pipeline Service

The [`Security::SecurityOrchestrationPolicies::ScanPipelineService`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/app/services/security/security_orchestration_policies/scan_pipeline_service.rb) generates the actual job configurations. It receives a list of scan actions from the policy and returns job definitions partitioned into `pipeline_scan` and `on_demand` categories, along with variables to inject.

## Step 4: Pipeline Execution Policy (PEP) Evaluation

The [`Gitlab::Ci::Pipeline::Chain::PipelineExecutionPolicies::EvaluatePolicies`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/lib/ee/gitlab/ci/pipeline/chain/pipeline_execution_policies/evaluate_policies.rb) step runs early in the chain, before `Config::Content`. It evaluates applicable Pipeline Execution Policies, builds isolated pipeline objects for each policy, and stores them in `command.pipeline_policy_context`.

```ruby
def perform!
  command
    .pipeline_policy_context
    .pipeline_execution_context
    .build_policy_pipelines!(pipeline.partition_id) do |error_message|
      break error("Pipeline execution policy error: #{error_message}", failure_reason: :config_error)
    end
end
```

This early positioning is intentional: the step must run before `Config::Content` to force pipeline creation when no `.gitlab-ci.yml` exists, and before `Skip` to enforce pipeline creation regardless of `ci.skip` options.

## Step 5: PEP Job Merging

The [`Gitlab::Ci::Pipeline::Chain::PipelineExecutionPolicies::ApplyPolicies`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/lib/ee/gitlab/ci/pipeline/chain/pipeline_execution_policies/apply_policies.rb) step runs late in the chain, after `Populate` and `PopulateMetadata`, to merge PEP jobs into the pipeline stages.

```ruby
def perform!
  policy_context = command.pipeline_policy_context.pipeline_execution_context

  if policy_context.creating_policy_pipeline?
    collect_policy_pipeline_stages
  elsif policy_context.has_execution_policy_pipelines?
    clear_project_pipeline # Remove dummy job if pipeline was forced
    merge_policy_jobs # Inject policy jobs into stages
    usage_tracking.track_enforcement
  end
end
```

The `clear_project_pipeline` method removes the dummy job when the pipeline was forced by a policy with no other config. The `merge_policy_jobs` method iterates through policy pipelines and uses [`Gitlab::Ci::Pipeline::JobsInjector`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/lib/gitlab/ci/pipeline/jobs_injector.rb) to inject jobs into the appropriate stages.

### Jobs Injector

The `JobsInjector` handles job injection with conflict resolution. It only injects into stages declared in the project config, handles job name conflicts via a callback (typically adding a suffix), updates `needs` references when jobs are renamed, and raises `DuplicateJobNameError` if a conflict can't be resolved.
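
The injection rules listed above can be illustrated with a small model built on plain hashes. This is a hedged sketch and not the real `Gitlab::Ci::Pipeline::JobsInjector`: the `inject_jobs` function, the hash-based job shape, and the `:policy-1` suffix are assumptions made for illustration; only the `DuplicateJobNameError` name comes from the description.

```ruby
class DuplicateJobNameError < StandardError; end

# pipeline_stages: Hash of stage name => Array of job hashes (hypothetical shape).
# policy_jobs:     Array of job hashes with :name, :stage and optional :needs.
# on_conflict:     block that receives a clashing name and returns a replacement.
def inject_jobs(pipeline_stages, policy_jobs, &on_conflict)
  existing = pipeline_stages.values.flatten.map { |job| job[:name] }
  renames = {}

  # Only jobs whose stage is declared in the project pipeline are injected.
  accepted = policy_jobs.select { |job| pipeline_stages.key?(job[:stage]) }

  renamed = accepted.map do |job|
    name = job[:name]
    if existing.include?(name)
      new_name = on_conflict&.call(name)
      if new_name.nil? || existing.include?(new_name)
        raise DuplicateJobNameError, "job names must be unique (#{name})"
      end
      renames[name] = new_name
      name = new_name
    end
    existing << name
    job.merge(name: name)
  end

  # Point `needs` at the new names of any policy jobs that were renamed.
  renamed.each do |job|
    needs = (job[:needs] || []).map { |need| renames.fetch(need, need) }
    pipeline_stages[job[:stage]] << job.merge(needs: needs)
  end

  pipeline_stages
end

stages = {
  'build' => [{ name: 'compile', stage: 'build' }],
  'test' => [{ name: 'rspec', stage: 'test' }]
}
policy_jobs = [
  { name: 'rspec', stage: 'test' },                    # clashes with the project job
  { name: 'report', stage: 'test', needs: ['rspec'] }, # depends on the renamed job
  { name: 'deploy-check', stage: 'deploy' }            # stage not declared, so skipped
]

result = inject_jobs(stages, policy_jobs) { |name| "#{name}:policy-1" }
result['test'].each { |job| puts "#{job[:name]} needs=#{job[:needs].inspect}" }
```

Calling `inject_jobs` without a block models the unresolved case: the first name clash raises `DuplicateJobNameError` instead of silently overwriting the project job.
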
