[meta] Maintainability of Gemnasium
Topic to Evaluate
Gemnasium's primary functions are:
1. Identify dependencies in a project.
2. Generate GitLab-compatible reports.
In order to perform task (1), Gemnasium relies on (a) lockfiles and/or (b) build artifacts. This creates a tight coupling with the target framework and with the build process, which can require expert domain knowledge.
For task (2), Gemnasium compares the information from task (1) against known security advisories, and generates a GitLab dependency scanning report listing all the dependencies and any vulnerabilities found in them. More recently, it also generates a CycloneDX (CDX) report that lists the dependencies but carries no vulnerability information.
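The two tasks above can be sketched as a small pipeline: a dependency list goes in, it is matched against advisories, and a report comes out. This is only an illustration and not Gemnasium's actual implementation. The `Dependency` and `Advisory` types, the `find_vulnerabilities` and `build_report` functions, and the report keys are all hypothetical, and matching on exact versions is a simplification (real advisories typically describe affected version ranges).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dependency:
    """Hypothetical result of task (1): one package found in a lockfile."""
    name: str
    version: str


@dataclass(frozen=True)
class Advisory:
    """Hypothetical advisory record; real advisories use version ranges."""
    identifier: str
    package: str
    affected_versions: frozenset  # simplification: exact versions only


def find_vulnerabilities(dependencies, advisories):
    """Return (dependency, advisory) pairs where the version is affected."""
    # Index advisories by package name so each dependency is a cheap lookup.
    by_package = {}
    for advisory in advisories:
        by_package.setdefault(advisory.package, []).append(advisory)

    findings = []
    for dep in dependencies:
        for advisory in by_package.get(dep.name, []):
            if dep.version in advisory.affected_versions:
                findings.append((dep, advisory))
    return findings


def build_report(dependencies, advisories):
    """Task (2): list every dependency plus any matching vulnerabilities.

    The keys below are illustrative and do not follow the real
    GitLab dependency scanning report schema.
    """
    findings = find_vulnerabilities(dependencies, advisories)
    return {
        "dependencies": [
            {"name": d.name, "version": d.version} for d in dependencies
        ],
        "vulnerabilities": [
            {"id": a.identifier, "package": d.name, "version": d.version}
            for d, a in findings
        ],
    }
```

Keeping the matching step separate from the step that produces the dependency list is what makes it possible to swap either half for an external tool, which is the option discussed in the rest of this issue.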
When it was first introduced to GitLab, about 6 years ago, there wasn't a single tool that could inspect all - or even most - of the required target frameworks. Creating a polyglot tool not only allowed GitLab to support any framework, but also reduced the complexity of managing multiple tools for frameworks that did happen to have support readily available.
Today, there are open-source tools that can generate a dependency list for multiple frameworks (e.g. cdxgen, syft). Many package managers also either include this functionality (e.g. as a plugin) or have a specific tool that can do it (examples).
As for task (2), there are also tools that can take a dependency list as an input, and generate a security report as the output (e.g. osv-scanner, trivy).
While both primary functions of Gemnasium can potentially be replaced, there are functions that may still require a GitLab-specific tool:
- GitLab doesn't yet support all CDX properties (e.g. vulnerabilities) and requires a specific taxonomy for some functionality (e.g. dependency lists).
- Control and aggregation of advisory database (e.g. GLAD).
- Ability to address security vulnerabilities in the tools.
- Other GitLab-specific functionality (e.g. generalized vulnerability details).
- Ability to continue releasing the tool should the upstream sources become unavailable.
The necessity for this GitLab-specific tool was also highlighted as part of recent deliveries:
- DS for Android: the output from the upstream tool needs to be converted to CDX. A security report needs to be generated based on the CDX.
- DS for Swift: a security report needs to be generated based on the CDX.
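Both deliveries share the same first step: reading the component list out of a CDX document so that a security report can be built from it. A minimal sketch of that step follows, assuming the CDX report is CycloneDX JSON with the standard `bomFormat` and `components` fields. The function name `components_from_cdx` and the shape of the returned dictionaries are hypothetical, and keeping only `library` components is an assumption made for the example.

```python
import json


def components_from_cdx(document: str):
    """Extract a flat dependency list from a CycloneDX JSON document.

    Hypothetical helper: only components of type "library" are kept,
    which is an assumption made for this sketch.
    """
    bom = json.loads(document)
    if bom.get("bomFormat") != "CycloneDX":
        raise ValueError("not a CycloneDX document")

    return [
        {
            "name": component["name"],
            "version": component.get("version", ""),
            "purl": component.get("purl"),
        }
        for component in bom.get("components", [])
        if component.get("type") == "library"
    ]


# Usage with a small inline document (package names are illustrative).
example = json.dumps({
    "bomFormat": "CycloneDX",
    "specVersion": "1.4",
    "components": [
        {"type": "library", "name": "Alamofire", "version": "5.6.0",
         "purl": "pkg:swift/github.com/Alamofire/Alamofire@5.6.0"},
        {"type": "application", "name": "my-app", "version": "1.0.0"},
    ],
})
dependencies = components_from_cdx(example)
```

Whichever upstream tool produces the CDX, a step like this is where a GitLab-specific tool would take over.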
In summary, it seems likely that we'll continue to need a Gemnasium-like tool, but with much of the functionality outsourced. The question is: do we transform Gemnasium into such a tool, or do we start a new project?
Tasks to Evaluate
While we have #434143 to evaluate the tools to generate CDX reports, this issue focuses on the strategy used to adopt said tools.
Some of the outcomes could be:
Keep Gemnasium
Keep Gemnasium, its related projects, and the SDLC much like today. Gradually replace and remove functionality.
This choice involves the least amount of change to the current development practices. This could be both a pro and a con.
Gemnasium has grown to be a complex application. There's a significant learning curve to become proficient at developing it and, even then, some tasks may remain challenging.
Improve Gemnasium
Improve Gemnasium, its related projects, and the SDLC to increase delivery speed. Then gradually replace and remove functionality.
Before relying on Gemnasium for new functionality, identify the current deficiencies and work to address them.
Start a new project
Start a new project. Reuse code from Gemnasium where appropriate.
If we believe that it takes less time to create a new project than to improve Gemnasium, we could start fresh. This eliminates baggage and cuts through discussions of how to increase development efficiency in Gemnasium, at the cost of needing to deprecate and support Gemnasium for 2 years.