Spike: Decide on monolithic versus composable architecture for pipeline-based SCA features
Topic to Evaluate
The new DS analyzer was originally designed to focus exclusively on dependency detection in lockfile and graph files, with security analysis handled by the SBOM scanning logic in the Rails platform. However, two recent developments have created new requirements:
- We've decided to reinstate DS security report generation in the CI job, requiring us to add new logic and possibly restore previously removed logic from gemnasium (report generation).
- The development of Static Reachability, which uses the CycloneDX report artifact, encourages integrating enrichment logic into the existing DS job to avoid generating duplicate SBOM reports.
These requirements raise a fundamental architectural question for our engineering team: Should we:
- Consolidate additional logic within the Dependency Scanning CI job and possibly the new DS analyzer, or
- Maintain discrete components and orchestrate them?
There are also upcoming features on our roadmap that will likely influence this design, so it's a good time to review our options.
While it is impossible to predict exactly what's coming, we should try to make decisions that won't paint us or our users into a corner and force another redesign down the road.
CI Jobs organization
As mentioned above, the existing DS feature offers a single CI job that currently parses lockfiles and graph files and generates an SBOM report artifact. To extend its functionality, we have the following options:
- Extend the existing Dependency Scanning CI job to integrate related capabilities (vulnerability scanning, static reachability, license scanning, etc.)
- Proposal: #525958 (comment 2411632117)
- Use a separate CI Job for each distinct task.
- Proposal: #525958 (comment 2409361198)
CI integration
Even if a modular design leveraging different CI jobs is chosen, it's still possible to keep a single entry point for the integration in CI. Thus, we should review the following options:
- A single CI/CD template or CI/CD component (e.g. SCA.gitlab-ci.yml) that provides all the necessary jobs to enable all related features.
- Proposal: #525958 (comment 2411643955)
- Separate CI/CD templates or CI/CD components for each feature or capability
- Proposal: #525958 (comment 2411641456)
Project(s) organization
We have multiple items to consider here: source code, executables, and container images.
How the source code providing our functionality is organized, released, and deployed will affect our development team's efficiency and our long-term maintenance. It will also have an impact on cost and usability, both for us and our customers.
We should keep in mind the lessons learned from the past. For instance, we merged the three Gemnasium projects into a single one but kept building three distinct images. The benefits and downsides of how the PMDB components are organized are also worth revisiting.
Some options to review:
- Monorepo
- Dedicated projects
- Single image with single analyzer
- Single image with multiple analyzers
- Multiple images with their dedicated analyzer
Outcomes
CI Jobs organization
Decision: We will stick to a single CI job that performs dependency detection, vulnerability scanning (upcoming), static reachability analysis, and SBOM enrichment. Thus, there is no need to implement a complex CI configuration involving multiple jobs, as suggested in the Modular CI Jobs Proposal.
Justification: The recent evolution of our feature design has reshaped the initial requirements and the problem to solve. In particular, it has removed the need for additional granularity and allows us to stick to a simpler solution. Additionally, this spike has demonstrated that while our entire feature process can be decomposed into distinct tasks, there might not be much value in exposing that decomposition to the end user, particularly when it means exposing it through the CI/CD configuration. Indeed, recent experience with integrating the Static Reachability beta, which is based on multiple CI jobs, has demonstrated the risks and challenges we face when we have to manage CI job orchestration. See #525958 (comment 2493383416)
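To illustrate the single-job approach, here is a minimal sketch in which each task is an in-process stage that receives and returns the same SBOM document, so no artifact hand-off between jobs is needed. The stage functions, the placeholder component, and the `reachable` field are hypothetical and only serve the illustration. They do not reflect the analyzer's actual code or the CycloneDX enrichment format we use.

```python
from typing import Callable, Dict, List

# Simplified stand-in for a CycloneDX document (assumption: a plain dict).
Sbom = Dict[str, object]


def detect_dependencies(sbom: Sbom) -> Sbom:
    # Hypothetical stage: a real one would parse lockfiles and graph files.
    components = [{"name": "example-lib", "version": "1.0.0"}]
    return {**sbom, "components": components}


def scan_vulnerabilities(sbom: Sbom) -> Sbom:
    # Hypothetical stage: a real one would match components against advisories.
    return {**sbom, "vulnerabilities": []}


def analyze_reachability(sbom: Sbom) -> Sbom:
    # Hypothetical enrichment: tag every component with an unknown
    # reachability status (None) instead of producing a second SBOM.
    components = [{**c, "reachable": None} for c in sbom["components"]]
    return {**sbom, "components": components}


STAGES: List[Callable[[Sbom], Sbom]] = [
    detect_dependencies,
    scan_vulnerabilities,
    analyze_reachability,
]


def run_single_job(stages: List[Callable[[Sbom], Sbom]] = STAGES) -> Sbom:
    # One job, one SBOM: each stage enriches the document produced so far.
    sbom: Sbom = {"bomFormat": "CycloneDX", "components": []}
    for stage in stages:
        sbom = stage(sbom)
    return sbom
```

With this shape, the decomposition into tasks stays internal to the job, and adding or reordering a stage does not change the user's CI/CD configuration.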
CI integration
Decision: We will stick to a single CI template (and a single CI/CD component)
Justification: No real benefit has been identified in offering distinct granular templates or components, even if we were to use a granular approach with a separate CI job for each task. See #525958 (comment 2412021386)
Project organization
Decision: We will stick to a single project (the new dependency-scanning analyzer) and integrate new capabilities there. Since we can stick to a single DS job, we'll keep building a single image. We currently build a single binary with distinct commands called separately from the CI job's script, but we might revisit this in the near term.
Justification: While the spike has exposed diverse opinions on the pros and cons of the various project organizations, the immediate need to deliver high-priority roadmap items has pushed us toward one of the most effective solutions. This offers velocity without closing the door to adjusting the organization later. That said, we have not identified immediate or mid-term needs that would require it. See #525958 (comment 2495499516)
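As a rough illustration of the "single binary with distinct commands" layout mentioned in the decision, the sketch below dispatches subcommands from one entry point. The program name `analyzer`, the command names `detect` and `enrich`, and the handlers are hypothetical. It is written in Python for brevity and does not claim to mirror the analyzer's real command set.

```python
import argparse
from typing import Callable, Dict, List, Optional


def run_detect() -> str:
    # Hypothetical handler: placeholder for dependency detection.
    return "sbom generated"


def run_enrich() -> str:
    # Hypothetical handler: placeholder for SBOM enrichment.
    return "sbom enriched"


# Map each subcommand name to its handler.
HANDLERS: Dict[str, Callable[[], str]] = {
    "detect": run_detect,
    "enrich": run_enrich,
}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="analyzer")
    subparsers = parser.add_subparsers(dest="command", required=True)
    for name in HANDLERS:
        subparsers.add_parser(name)
    return parser


def main(argv: Optional[List[str]] = None) -> str:
    # The CI job's script would invoke the same executable once per command.
    args = build_parser().parse_args(argv)
    return HANDLERS[args.command]()
```

Calling `main(["detect"])` and then `main(["enrich"])` corresponds to a job script that runs the two commands one after the other. Collapsing them into a single command later would only change this dispatch layer, not the project or image layout.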