[SPIKE] Supporting 3rd Party SBOM Documents
Overview
This spike investigates how GitLab can better support 3rd party SBOM (Software Bill of Materials) documents to extend the coverage of our Software Composition Analysis (SCA) features.
Many customers generate SBOMs using external tools (e.g., Syft, Wiz, Snyk, Trivy, CycloneDX generators). Today, ingestion in GitLab is limited by strict metadata requirements and GitLab-specific taxonomy. This can block adoption or lead to degraded results (e.g., missing vulnerabilities or misclassified components).
The goal of this spike is to identify the current limitations, requirements, and possible solutions that would make it easier for customers to adopt 3rd party SBOMs in GitLab, while reducing reliance on GitLab-specific fields.
Objectives
- Understand the current state of 3rd party SBOM ingestion in GitLab and Continuous Vulnerability Scanning (CVS).
- Identify mandatory and optional metadata fields required for correct ingestion and fingerprinting.
- Map CycloneDX schema fields to GitLab taxonomy, highlighting gaps and overlaps.
- Research the output of common 3rd party SBOM generators and their compatibility with GitLab.
- Explore approaches to handle mixed SBOMs (OS-level + application-level components) without strict Container Scanning (CS) vs. Dependency Scanning (DS) separation.
- Propose solutions to improve compatibility, user experience, and documentation.
- Document risks (e.g., migrations required if input_file is missing) and fallback mechanisms.
Research Questions
CycloneDX Schema and GitLab Requirements
1. Schema & Taxonomy Alignment
- What are the mandatory metadata fields required for GitLab to properly ingest an SBOM?
- Analyze the importance of input_file, purl_type, package_manager, etc.
- Determine which fields are critical / required vs. nice-to-have
- How do the GitLab taxonomy requirements align with the standard CycloneDX schema?
- Can we map standard CycloneDX fields to our required GitLab taxonomy fields?
- Are there opportunities to reduce our reliance on GitLab-specific taxonomy?
- What are the implications of missing GitLab SBOM metadata properties (like input_file) on vulnerability fingerprinting?
- Would such changes require migrations?
- Can we develop fallback mechanisms for incomplete SBOMs, for example by using the project root when input_file is missing?
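The fallback question above could be prototyped along these lines. This is a minimal sketch, not GitLab's ingestion code. The property name `gitlab:dependency_scanning:input_file:path` is assumed from the GitLab CycloneDX property taxonomy and should be verified. The helper `ensure_input_file` and the project-root placeholder `"."` are hypothetical.

```python
import copy
from typing import Tuple

# Assumed property name; verify against the GitLab CycloneDX property taxonomy.
INPUT_FILE_PROPERTY = "gitlab:dependency_scanning:input_file:path"


def ensure_input_file(sbom: dict, fallback_path: str = ".") -> Tuple[dict, bool]:
    """Return a copy of the SBOM that always carries an input_file property.

    The boolean is True when the (hypothetical) project-root fallback was applied.
    """
    result = copy.deepcopy(sbom)  # never mutate the caller's document
    metadata = result.setdefault("metadata", {})
    properties = metadata.setdefault("properties", [])
    for prop in properties:
        if prop.get("name") == INPUT_FILE_PROPERTY and prop.get("value"):
            return result, False  # the generator already provided it
    properties.append({"name": INPUT_FILE_PROPERTY, "value": fallback_path})
    return result, True


# A third-party SBOM with no GitLab-specific metadata at all.
third_party_sbom = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "components": [
        {
            "type": "library",
            "name": "lodash",
            "version": "4.17.21",
            "purl": "pkg:npm/lodash@4.17.21",
        },
    ],
}

patched, used_fallback = ensure_input_file(third_party_sbom)
print(used_fallback)
print(patched["metadata"]["properties"])
```

Whether a placeholder value like this keeps vulnerability fingerprints stable, or forces a migration once a real path becomes available, is exactly what the questions above are meant to answer.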
2. 3rd Party SBOM Generators
- What are the most commonly used 3rd party SBOM generators in the industry?
- Survey tools like Wiz, Snyk, Trivy, Syft, CycloneDX generators, etc.
- Analyze their output format and compatibility with GitLab requirements.
- Can these generators be configured to include the GitLab-required fields?
- Research configuration options for popular generators
- Identify gaps where custom post-processing might be needed
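To make the gap analysis repeatable across generators, a small report script could be run against each sample SBOM. The sketch below is illustrative only. The list of required properties is an assumption based on the GitLab CycloneDX property taxonomy (the real list is an output of this spike), and `find_gaps` is a hypothetical helper.

```python
# Assumed names; the authoritative list of required fields is an output of this spike.
ASSUMED_REQUIRED_PROPERTIES = [
    "gitlab:dependency_scanning:input_file:path",
    "gitlab:dependency_scanning:package_manager:name",
]


def find_gaps(sbom, required_properties=ASSUMED_REQUIRED_PROPERTIES):
    """Report which assumed-required fields a CycloneDX JSON document lacks."""
    metadata = sbom.get("metadata") or {}
    present = {prop.get("name") for prop in metadata.get("properties") or []}
    components = sbom.get("components") or []
    return {
        # Metadata properties the generator did not emit.
        "missing_metadata_properties": [
            name for name in required_properties if name not in present
        ],
        # Components that cannot be matched by package URL.
        "components_without_purl": [
            component.get("name", "<unnamed>")
            for component in components
            if not component.get("purl")
        ],
    }


sample_sbom = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "metadata": {
        "properties": [
            {
                "name": "gitlab:dependency_scanning:package_manager:name",
                "value": "npm",
            },
        ],
    },
    "components": [
        {
            "type": "library",
            "name": "express",
            "version": "4.18.2",
            "purl": "pkg:npm/express@4.18.2",
        },
        {"type": "library", "name": "internal-lib", "version": "1.0.0"},
    ],
}

gaps = find_gaps(sample_sbom)
print(gaps)
```

Running the same report over output from each generator would show which gaps can be closed by configuration and which need post-processing.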
3. Feature-Specific Requirements
- How can we better handle SBOMs that contain both OS-level and application-level components?
- Analyze the current CS vs DS distinction and its limitations
- Explore options for unified vulnerability detection regardless of component type
- What are the specific requirements for each downstream feature?
- Dependency Scanning
- License Scanning
- Continuous Vulnerability Scanning
- Container Scanning
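One possible way to handle a mixed SBOM is to classify each component by the type segment of its package URL instead of by the job that produced the document. The sketch below assumes that `deb`, `rpm`, and `apk` purl types indicate OS-level packages. That set is illustrative and not an agreed GitLab rule, and both helper functions are hypothetical.

```python
# Illustrative assumption: purl types treated as OS-level packages.
OS_PURL_TYPES = {"deb", "rpm", "apk"}


def purl_type(purl):
    """Extract the type segment from a package URL, or None if it is unusable."""
    if not purl or not purl.startswith("pkg:"):
        return None
    remainder = purl[len("pkg:"):].lstrip("/")  # tolerate 'pkg://type/...'
    type_part = remainder.split("/", 1)[0]
    return type_part.lower() or None


def partition_components(components):
    """Group components into OS-level, application-level, and unknown."""
    groups = {"os": [], "application": [], "unknown": []}
    for component in components:
        kind = purl_type(component.get("purl"))
        if kind is None:
            groups["unknown"].append(component)
        elif kind in OS_PURL_TYPES:
            groups["os"].append(component)
        else:
            groups["application"].append(component)
    return groups


mixed_components = [
    {"name": "openssl", "purl": "pkg:deb/debian/openssl@3.0.11?arch=amd64"},
    {"name": "express", "purl": "pkg:npm/express@4.18.2"},
    {"name": "vendored-lib"},  # no purl, so it cannot be classified
]

groups = partition_components(mixed_components)
for label, members in groups.items():
    print(label, [member["name"] for member in members])
```

Keeping an explicit `unknown` group makes it visible how many components a given generator emits without enough information for either kind of scanning.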
User Experience and Adoption
- What are the main pain points customers face when trying to use 3rd party SBOMs with GitLab?
- Review support tickets and customer feedback
- Identify common failure patterns
- How can we improve the documentation to guide users in preparing compatible SBOMs?
Methodology
- Code Review
- Analyze current SBOM ingestion flow in GitLab and CVS.
- Identify where taxonomy and metadata validation occurs.
- Documentation Review
- Study CycloneDX specification (esp. optional vs required fields).
- Review GitLab documentation on SBOM ingestion and taxonomy.
- Tool Analysis
- Generate SBOMs with common tools (Syft, Wiz, Snyk, Trivy, CycloneDX CLI).
- Compare output with GitLab requirements.
- 🗒️ Note: Some SBOMs (such as those from Wiz or Snyk) cannot be generated with open-source tools because they are artifacts of proprietary scanners. To cover these, we will collaborate with Support, Solution Architects, and Customer Support Managers to obtain representative SBOM samples for analysis.
- Customer Research
- Review support tickets and internal feedback from Support, SA, and PM teams.
- Collect real-world SBOM samples if available.
- Prototype
- Build proof-of-concept ingestion for at least Wiz, Snyk and CycloneDX SBOM generators.
- Test fallback strategies for missing metadata (e.g., project root as input_file).
Success Criteria
- A clear, documented mapping of GitLab-required metadata to CycloneDX schema fields.
- Identified gaps where GitLab requires custom fields, with recommendations for reducing this reliance.
- Analysis of at least 3 major 3rd party SBOM generators and their configuration options.
- Concrete recommendations for improving ingestion.
- Prototype or example ingestion demonstrating feasibility of improvements.
This spike will help us understand how to better support customers who are generating their own SBOMs using 3rd party tools, ultimately increasing adoption of GitLab Ultimate security features.
References
This spike is created as part of Epic 14760 and aims to investigate how we can implement the epic’s goals.