Extend Scheduled Scan Execution Policy with Cluster Image Scanning scan

Why are we doing this work

We want to allow customers to collect vulnerabilities from images in running Kubernetes clusters so they can understand their current security risk, not only for images scanned as part of a CI pipeline but also for images deployed without GitLab CI.

You can find more about our motivation to work on this issue here.

This issue extends the Scheduled Scan Execution Policy to support the Cluster Image Scanning scan. Customers will be able to schedule a security scan of running containers at a selected time and choose which namespaces and resource names should be scanned. The namespaces, resources, kinds, and containers fields in the policy configuration should be optional. When the user does not provide them, we should fetch all vulnerabilities found in the cluster. An example policy with every field set:

scan_execution_policy:
  name: Run Cluster Image Scan
  description: This policy enforces container scans against a production cluster on a daily basis.
  enabled: true
  rules:
  - type: schedule
    clusters:
      production:
        namespaces:
        - production
        - staging
        resources:
        - nginx
        - knative
        containers:
        - nginx
        - knative
        kinds:
        - deployment
    cadence: 0 0 * * *
  actions:
  - scan: cluster_image_scanning
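
A minimal Ruby sketch of how the optional fields could be handled when reading a rule. The names `OPTIONAL_FILTERS` and `normalize_cluster_filters` are hypothetical, and treating an empty list as "no restriction" is an assumption for illustration, not something this issue specifies:

```ruby
require 'yaml'

# Hypothetical constant: the four optional per-cluster fields named in this issue.
OPTIONAL_FILTERS = %w[namespaces resources containers kinds].freeze

# Hypothetical helper: returns cluster name => filters, where every missing
# optional field becomes an empty array (assumed here to mean "scan everything").
def normalize_cluster_filters(rule)
  (rule['clusters'] || {}).to_h do |cluster_name, filters|
    filters ||= {}
    normalized = OPTIONAL_FILTERS.to_h { |key| [key, Array(filters[key])] }
    [cluster_name, normalized]
  end
end

# A rule that only restricts namespaces; the other fields are omitted.
policy = YAML.safe_load(<<~POLICY)
  rules:
  - type: schedule
    clusters:
      production:
        namespaces:
        - production
    cadence: "0 0 * * *"
POLICY

filters = normalize_cluster_filters(policy['rules'].first)
```

With this shape, downstream code can iterate over the same four keys for every cluster without checking whether the user provided them.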

Relevant links

Epic
Design

Non-functional requirements

Documentation: add documentation to doc/user/application_security/policies/index.md explaining how to configure a policy to use this analyzer,
[-] Feature flag: no feature flag is needed as this is something that users will optionally select by including the GitLab CI template
[-] Performance:
Testing:
- Test if you can configure policy to ,
- Test if vulnerabilities from the security report of the cluster_image_scanning analyzer are visible in the database

Implementation plan

backend extend ee/app/validators/json_schemas/security_orchestration_policy.json with the new action scan type cluster_image_scan and check that the namespaces, resources, kinds, and containers fields are valid when provided (all of them are optional for this new scan type),
backend extend ee/lib/gitlab/ci/config/security_orchestration_policies/processor.rb with logic to extend the config when cluster_image_scan is provided in the policy (you could create a new service similar to Security::SecurityOrchestrationPolicies::OnDemandScanPipelineConfigurationService to prepare the job configuration),
backend extend the process_action method in ee/app/services/security/security_orchestration_policies/rule_schedule_service.rb to support the new cluster_image_scan scan and start a new pipeline with the Cluster Image Scan job only (create a new service to manage that, like DastOnDemandScans::CreateService),
backend create a new service, Security::SecurityOrchestrationPolicies::CreatePipelineService, that will be responsible for starting a new pipeline with the selected security scan. In this new service we will create and start the new pipeline:

Ci::CreatePipelineService
  .new(project, current_user, ref: branch)
  .execute(:security_orchestration_policy, content: ci_yaml)

backend extend the self.sources method in app/models/concerns/enums/ci/pipeline.rb with a new value, security_orchestration_policy: 14, and add security_orchestration_policy to the slice arguments in the self.dangling_sources method,
backend extend the dangling_build? method in lib/gitlab/ci/pipeline/chain/command.rb with security_orchestration_policy,
backend remember to send the values of the namespaces and resources fields to the job configuration whenever we configure the cluster_image_scan job, and make sure that the analyzer receives and uses these values to properly fetch vulnerabilities from the cluster.
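
The last step above, passing policy fields into the job configuration, could be sketched roughly as follows. Everything here is an assumption for illustration: the helper name `cluster_image_scan_ci_yaml`, the `CIS_*` variable names, the comma-separated encoding, and the `cluster_image_scanning` job key are not specified in this issue, and the rest of the job definition (image, script) is expected to come from elsewhere, for example an included CI template:

```ruby
require 'yaml'

# Assumed mapping from policy fields to CI variables read by the analyzer.
FILTER_VARIABLES = {
  'namespaces' => 'CIS_RESOURCE_NAMESPACES',
  'resources'  => 'CIS_RESOURCE_NAMES',
  'containers' => 'CIS_CONTAINER_NAMES',
  'kinds'      => 'CIS_RESOURCE_KINDS'
}.freeze

# Hypothetical helper: builds the YAML passed as `content:` when creating the
# pipeline. Fields the user did not provide produce no variable at all, so the
# analyzer would fall back to scanning everything it finds.
def cluster_image_scan_ci_yaml(filters)
  variables = FILTER_VARIABLES.each_with_object({}) do |(field, variable), result|
    values = Array(filters[field])
    result[variable] = values.join(',') unless values.empty?
  end

  job = {}
  job['variables'] = variables unless variables.empty?

  { 'cluster_image_scanning' => job }.to_yaml
end

ci_yaml = cluster_image_scan_ci_yaml(
  { 'namespaces' => %w[production staging], 'kinds' => ['deployment'] }
)
```

Omitting a variable rather than sending an empty string keeps "not provided" distinguishable from "provided but empty" on the analyzer side, which matches the requirement that missing fields mean the whole cluster is scanned.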

Edited Sep 21, 2021 by Alan (Maciej) Paruszewski