Introduce SbomScan model and Uploader for DS using SBOM

Overall goal

This is a new approach to our Dependency Scanning feature that reuses the GitLab SBOM Vulnerability Scanner developed in the Rails platform to scan SBOM files generated in a CI job. This allows GitLab to centralize the Dependency Scanning feature on a single scanning engine for various scanning contexts. Please see the epic Bring security scan results back into the Depen... (&17150 - closed) for more details.

This MR is part of a stacked diff:

  1. ➡️ Introduce SbomScan model and Uploader for DS using SBOM (!195058 (merged))
  2. Add services to create and process SbomScan models (!195059 (merged))
  3. Add service and worker to destroy expired SbomScan models (!195061 (merged))
  4. Add Sbom Scan API endpoints with direct-upload support (!195062 (merged))

Thus, the complete implementation with all code changes is available in the last MR (!195062 (merged)).

Notes:

  • This feature will be released behind the feature flag dependency_scanning_sbom_scan_api included in the last MR (with the user-facing API).
  • These SBOM scans are only meant to be ephemeral and thus they don't interact with the Vulnerability Management system.

What does this MR do and why?

The changes create a new sbom_vulnerability_scans table that stores information about security scans, including the scan status (created, running, finished, or failed) and two types of files: the original SBOM document and the vulnerability scan results. The SBOM file will be uploaded by a running CI job using direct upload, and the scan result will be uploaded by a Sidekiq job after performing the vulnerability scan. Each scan record is linked to a specific project and CI job.
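
For illustration, here is a minimal sketch of what such a table could look like using the plain Rails migration DSL. The column names and options below are assumptions for readability; the authoritative schema (including GitLab's own migration helpers and constraints) is in the diff.

# Hypothetical sketch; not the actual migration from this MR.
class CreateSbomVulnerabilityScans < ActiveRecord::Migration[7.0]
  def change
    create_table :sbom_vulnerability_scans do |t|
      t.bigint :project_id, null: false          # owning project
      t.bigint :build_id, null: false            # CI job that uploaded the SBOM
      t.integer :status, null: false, default: 0 # created/running/finished/failed
      t.timestamps
      t.integer :sbom_file_store                 # local vs. object storage
      t.text :sbom_file                          # original SBOM document filename
      t.integer :scan_file_store
      t.text :scan_file                          # scan results filename
    end

    add_index :sbom_vulnerability_scans, :project_id
    add_index :sbom_vulnerability_scans, :build_id
  end
end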

The added table belongs to the gitlab_sec schema, which was recently introduced to decompose security-related data from the main schema.

The implementation includes proper file upload handling with support for both local and cloud storage, database relationships to ensure data integrity when related records are deleted, and validation rules to ensure scans have the required files based on their current status.
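
As a rough sketch of how these pieces could fit together (the class and attribute names below are inferred from the spec paths in this MR; the real code is in the diff):

# Hypothetical sketch; see the diff for the actual model.
module Security
  module VulnerabilityScanning
    class SbomScan < ApplicationRecord
      self.table_name = 'sbom_vulnerability_scans'

      belongs_to :project
      belongs_to :build, class_name: 'Ci::Build'

      # CarrierWave mounts; in GitLab, uploaders include
      # ObjectStorage::Concern to support both local and cloud storage.
      mount_uploader :sbom_file, SbomScanUploader
      mount_uploader :scan_file, SbomScanUploader

      enum status: { created: 0, running: 1, finished: 2, failed: 3 }

      # The SBOM document is required from creation; the scan results
      # file is only required once the scan has finished.
      validates :sbom_file, presence: true
      validates :scan_file, presence: true, if: :finished?
    end
  end
end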

Table growth

What is the anticipated growth for the new table over the next 3 months, 6 months, 1 year? What assumptions are these based on?

Based on current usage of our Dependency Scanning feature, we see around 100K scans per day, so we expect at least 100K records to be created every day. The actual number is likely to be higher, since a single DS job can generate multiple SBOM documents and thus multiple records. The exact factor is not known today; we don't have metrics covering this detail, but we might be able to work something out on a clone of the production DB by counting the number of SBOM artifacts per CI job.

The records stored in that table will be ephemeral and the retention policy is currently set to 2 days (you can read more about the removal of records in the 3rd MR: Add service and worker to destroy expired SbomS... (!195061 - merged)).
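
For example, the cleanup could select expired rows with a scope along these lines (a sketch only, with an assumed RETENTION_PERIOD constant; the actual service and worker live in the 3rd MR):

# Hypothetical sketch; the real cleanup logic is in !195061.
module Security
  module VulnerabilityScanning
    class SbomScan < ApplicationRecord
      RETENTION_PERIOD = 2.days

      # Rows older than the retention period are eligible for deletion.
      scope :expired, -> { where(created_at: ...RETENTION_PERIOD.ago) }
    end
  end
end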

This means we expect the table to top out at around 300K records. This number will also be much smaller initially, as we don't expect all existing customers to onboard onto this new feature immediately. We aim to remove the old feature in 19.0, so we should expect to reach that number within a year. It's also probable that this number will continue to grow slowly over the next few years as adoption increases and more customers use the feature; that growth rate is unknown. At the same time, we might invest in various caching strategies to avoid redundant scans and thus reduce that number.

How many reads and writes per hour would you expect this table to have in 3 months, 6 months, 1 year? Under what circumstances are rows updated? What assumptions are these based on?

SBOM scans happen in bursts, so the load is not evenly spread throughout the day. Still, considering the 100K DS jobs per day with an average of 2 SBOMs per job, that works out to roughly 8K SBOM scans per hour (200K scans ÷ 24 hours ≈ 8.3K). Each scan involves the following queries:

  • 1 write query at creation
  • 1 read query when fetching the record (sidekiq job)
  • 1 write query when setting status to "running"
  • 1 write query when setting status to "finished" and storing the scan results file (or setting status to "failed" if an error occurs).
  • 1 read query when API fetches the scan results

This means:

  • 24K write queries per hour (8K scans/hour × 3 writes per scan)
  • 16K read queries per hour (8K scans/hour × 2 reads per scan)

Based on the anticipated data volume and access patterns, does the new table pose an availability risk to GitLab.com or GitLab Self-Managed instances? Does the proposed design scale to support the needs of GitLab.com and GitLab Self-Managed customers?

Considering the ephemeral nature of these records, the table itself doesn't pose an availability risk. Scaling concerns are rather around Sidekiq capacity, which is further explained in the 2nd MR and the corresponding thread in the issue. The data cleanup must scale accordingly and is described in the 3rd MR and the corresponding thread in the issue.

How to set up and validate locally

This MR only contains a subset of the code changes required for the feature. Please use the last MR of the stacked diff (!195062 (merged)) to test the feature.

This particular MR's changes can be tested in isolation by checking out the branch, running the DB migration, and running the tests:

git checkout ogonzalez-Add-DS-Sbom-Scan-API-6db15c85
bin/rails db:migrate:up:main RAILS_ENV=development --trace
bundle exec rspec ee/spec/models/security/vulnerability_scanning/sbom_scan_spec.rb ee/spec/uploaders/security/vulnerability_scanning/sbom_scan_uploader_spec.rb

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
