Record error rate on security scan reports
What does this MR do and why?
This MR adds the following:
- Gitlab::Metrics::SecurityScanSlis, an application SLI which provides the error rate metrics.
- Increments the metrics in Security::StoreGroupedScansService after security scan reports are parsed and stored.
After security scans finish in CI, report ingestion is started by Security::StoreScansWorker, which eventually calls Security::StoreGroupedScansService. The latter service parses the reports and stores them in the database. More information: https://docs.gitlab.com/ee/development/sec/security_report_ingestion_overview.html#scan-runs-in-a-pipeline-for-a-non-default-branch.
Here, we emit the error rate metrics based on each scan report's status. This gives us better observability into the security scanners' performance.
Note that scan reports currently always have "status": "success" because crashed scan jobs don't produce any report. This will be resolved in #241342. Once reports are also generated on job failure, the metrics will automatically record errors as well.
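To illustrate the counting behavior described above, here is a minimal standalone sketch. It is not the code in this MR: ErrorRateSliSketch and scan_error? are hypothetical names, and the sketch assumes that any report status other than "success" counts as an error. Every ingested report bumps the total counter, and only non-success reports bump the error counter.

```ruby
# Hypothetical stand-in for an error-rate SLI: two counters per label set.
# This is an illustration only, not Gitlab::Metrics::SecurityScanSlis.
class ErrorRateSliSketch
  attr_reader :totals, :errors

  def initialize
    @totals = Hash.new(0)
    @errors = Hash.new(0)
  end

  # Always bumps the total counter; bumps the error counter only on error.
  def increment(labels:, error:)
    @totals[labels] += 1
    @errors[labels] += 1 if error
  end
end

# Assumed mapping from report status to the error flag.
def scan_error?(report)
  report["status"] != "success"
end

sli = ErrorRateSliSketch.new
labels = { scan_type: "sast", feature_category: "static_application_security_testing" }

reports = [
  { "status" => "success" },
  { "status" => "success" },
  { "status" => "failure" } # illustrative only; crashed jobs produce no report today
]
reports.each { |report| sli.increment(labels: labels, error: scan_error?(report)) }

puts "total=#{sli.totals[labels]} error=#{sli.errors[labels]}"
```

With only "success" reports, as is the case today, the error counter stays at 0 while the total counter keeps growing, which matches the sample output in the validation steps.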
References
Part of First iteration of a User Journey SLI for Secur... (gitlab-com/gl-infra&1393 - closed).
Resolves gitlab-com/gl-infra/scalability#3908.
MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
How to set up and validate locally
- Enable a local runner in GDK
- Set up a project with a JSON security report (or import this one from URL)
- Trigger a new pipeline on said project
- Ensure the pipeline is successful
- Wait until the background job has kicked off to store reports (once complete, you'll see results under http://gdk.test:3000/<MY_PROJECT_PATH>/-/security/vulnerability_report)
- Navigate to Sidekiq's metrics exporter endpoint http://gdk.test:3807/metrics
- Search for gitlab_sli_security_scan_error_total
$ curl http://gdk.test:3807/metrics | grep security_scan
# TYPE gitlab_sli_security_scan_error_total counter
gitlab_sli_security_scan_error_total{feature_category="container_scanning",scan_type="cluster_image_scanning"} 0
gitlab_sli_security_scan_error_total{feature_category="container_scanning",scan_type="container_scanning"} 0
gitlab_sli_security_scan_error_total{feature_category="container_scanning",scan_type="container_scanning_for_registry"} 0
gitlab_sli_security_scan_error_total{feature_category="dynamic_application_security_testing",scan_type="api_fuzzing"} 0
gitlab_sli_security_scan_error_total{feature_category="dynamic_application_security_testing",scan_type="dast"} 0
gitlab_sli_security_scan_error_total{feature_category="fuzz_testing",scan_type="coverage_fuzzing"} 0
gitlab_sli_security_scan_error_total{feature_category="secret_detection",scan_type="secret_detection"} 0
gitlab_sli_security_scan_error_total{feature_category="software_composition_analysis",scan_type="dependency_scanning"} 0
gitlab_sli_security_scan_error_total{feature_category="static_application_security_testing",scan_type="sast"} 0
# HELP gitlab_sli_security_scan_total Multiprocess metric
# TYPE gitlab_sli_security_scan_total counter
gitlab_sli_security_scan_total{feature_category="container_scanning",scan_type="cluster_image_scanning"} 1
gitlab_sli_security_scan_total{feature_category="container_scanning",scan_type="container_scanning"} 1
gitlab_sli_security_scan_total{feature_category="container_scanning",scan_type="container_scanning_for_registry"} 0
gitlab_sli_security_scan_total{feature_category="dynamic_application_security_testing",scan_type="api_fuzzing"} 1
gitlab_sli_security_scan_total{feature_category="dynamic_application_security_testing",scan_type="dast"} 2
gitlab_sli_security_scan_total{feature_category="fuzz_testing",scan_type="coverage_fuzzing"} 1
gitlab_sli_security_scan_total{feature_category="secret_detection",scan_type="secret_detection"} 1
gitlab_sli_security_scan_total{feature_category="software_composition_analysis",scan_type="dependency_scanning"} 0
gitlab_sli_security_scan_total{feature_category="static_application_security_testing",scan_type="sast"} 3
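For a quick sanity check of the output above, the error rate per scan type is the error counter divided by the total counter. The following is a rough sketch, not part of this MR: parse_scan_counters and error_rate are hypothetical helpers, and the sample text is a subset of the lines shown above. It assumes the plain Prometheus text format with integer or decimal sample values.

```ruby
# Matches sample lines for the two counters, capturing name, labels, and value.
METRIC_LINE = /\A(gitlab_sli_security_scan(?:_error)?_total)\{(.*)\}\s+(\d+(?:\.\d+)?)\z/

# Returns { scan_type => { total:, error: } } from metrics exporter text.
def parse_scan_counters(text)
  counters = Hash.new { |hash, key| hash[key] = { total: 0.0, error: 0.0 } }
  text.each_line do |line|
    match = METRIC_LINE.match(line.strip)
    next unless match # skips "# HELP", "# TYPE", and unrelated lines

    scan_type = match[2][/scan_type="([^"]*)"/, 1]
    kind = match[1].end_with?("_error_total") ? :error : :total
    counters[scan_type][kind] += match[3].to_f
  end
  counters
end

# Error rate is error / total; nil when no scans were recorded.
def error_rate(counter)
  counter[:total].zero? ? nil : counter[:error] / counter[:total]
end

sample = <<~METRICS
  # TYPE gitlab_sli_security_scan_error_total counter
  gitlab_sli_security_scan_error_total{feature_category="static_application_security_testing",scan_type="sast"} 0
  gitlab_sli_security_scan_error_total{feature_category="software_composition_analysis",scan_type="dependency_scanning"} 0
  # TYPE gitlab_sli_security_scan_total counter
  gitlab_sli_security_scan_total{feature_category="static_application_security_testing",scan_type="sast"} 3
  gitlab_sli_security_scan_total{feature_category="software_composition_analysis",scan_type="dependency_scanning"} 0
METRICS

rates = parse_scan_counters(sample).transform_values { |counter| error_rate(counter) }
rates.each { |scan_type, rate| puts "#{scan_type}: #{rate.inspect}" }
```

Scan types with a total of 0 (such as dependency_scanning here) yield nil rather than a rate, since no scans were ingested for them.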