Replace license-finder MVC
<!-- The first section "Release notes" is required if you want to have your release post blog MR auto generated. Currently in BETA, details on the **release post item generator** can be found in the handbook: https://about.gitlab.com/handbook/marketing/blog/release-posts/#release-post-item-generator and this video: https://www.youtube.com/watch?v=rfn9ebgTwKg. The next four sections: "Problem to solve", "Intended users", "User experience goal", and "Proposal", are strongly recommended in your first draft, while the rest of the sections can be filled out during the problem validation or breakdown phase. However, keep in mind that providing complete and relevant information early helps our product team validate the problem and start working on a solution. -->

### Release notes

<!-- What is the problem and solution you're proposing? This content sets the overall vision for the feature and serves as the release notes that will populate in various places, including the [release post blog](https://about.gitlab.com/releases/categories/releases/) and [GitLab project releases](https://gitlab.com/gitlab-org/gitlab/-/releases). -->

### Strategic context

GitLab needs an end-to-end License Compliance solution with an underlying architecture that will allow it to address a more comprehensive set of user needs. Some of these needs include the following (in no particular order):

1. The ability to evaluate license information from all available sources (the package metadata, the package registry, and the `LICENSE` file itself). All, some, or none of this information may exist, and some of it may conflict.
1. The ability to handle complex licenses where multiple licenses are joined with `AND`/`OR` clauses.
1. The ability to accept user-provided license information for private packages or for registries where GitLab does not have license information available.
1. The ability to allow users to override license information provided by GitLab.
1. The ability to have license information updated if/when a more accurate license is discovered for a particular component without needing to rerun a pipeline job.
1. The ability to manage License Approval policies at the group and sub-group levels.
1. The ability to eventually expand beyond just vulnerability and license data to also provide management of other risk vectors for a package, such as package quality, maintainership, software supply chain risk, etc.
1. The ability to provide alerts and automatically generate merge requests when new versions of packages become available.

**Not all of these needs are in-scope** for this particular epic. Rather, this epic is focused on establishing an underlying architecture that will set us up for successfully addressing those needs over time.

Additionally, we already need to start storing dependency information in the database so that we can begin offering users the ability to do [Continuous Vulnerability Scanning](https://gitlab.com/groups/gitlab-org/-/epics/7886).

Also, we plan on ingesting, storing, and comparing data from our [GitLab Advisories Database](https://gitlab.com/gitlab-org/security-products/gemnasium-db) as part of our Dependency Scanning offering. Once that work is completed, we will ideally be able to share a large amount of code with that solution by simply extending it to also ingest and compare license information. This will not only make the initial implementation cost-efficient, it will also minimize our long-term maintenance burden.

Additionally, our users will benefit from this because they will be able to identify the list of project dependencies just once, when they run the Dependency Scanning analyzer. This will effectively eliminate the need to run the License Compliance job entirely, saving them the time and pipeline minutes of that extra job.
Yet another side benefit of this approach is that it will also allow us to do Continuous License Scanning. If license data is not available or is inaccurate for a package when the pipeline is originally run, then as soon as we do get license data for that package, we can update that information on the server side without requiring a new pipeline run.

### Problem to solve

<!-- What problem do we solve? Try to define the who/what/why of the opportunity as a user story. For example, "As a (who), I want (what), so I can (why/value)." -->

For this epic, we are focused on solving for the following:

1. As a Developer or DevOps Engineer, I want to be able to get dependency, vulnerability, and license information from the same job so that I can minimize the number of pipeline minutes that I use.
1. As a Developer or as a member of the Compliance team, I want to be able to provide license information for private projects or to override the license information that was detected by GitLab, so that I can correct license data that GitLab is unable to identify correctly.
1. As a member of my Compliance team, I want to be able to manage License Approval Policies at the group and sub-group level so that I can enforce those policies uniformly across all of my projects. (Note: This use case will not be met by this epic alone, as it also requires the work from https://gitlab.com/groups/gitlab-org/-/epics/8092.)

### Relevant Background

![Data_Flow_2](/uploads/c3a84f20822b01269d19047862736063/Data_Flow_2.png)

We are proposing to rearchitect the way that license and dependency information is handled in GitLab. See https://gitlab.com/groups/gitlab-org/-/epics/7886 for details.

### Proposal

<!-- How are we going to solve the problem? Try to include the user journey! https://about.gitlab.com/handbook/journeys/#user-journey -->

1. The GitLab [license-finder](https://gitlab.com/gitlab-org/security-products/analyzers/license-finder) project will be deprecated (date of deprecation is TBD) and removed in a major release (likely the %16.0 release; however, this has not yet been determined for certain).
1. Instead of recreating license-finder, we will create a license service in the Rails code. This service will do the following:
   1. Accept a list of dependencies by ingesting CycloneDX SBOMs. This is covered by https://gitlab.com/groups/gitlab-org/-/epics/8024.
   1. Query the database to see if license data is available for each dependency.
   1. Return the same list of dependencies together with corresponding license data based on the following priority order:
      1. Highest priority is given to matches for license information that is reported up by a CycloneDX file in the same project or by another project in the same top-level group.
      1. Second priority is given to license data that was imported from our [external license database](https://gitlab.com/groups/gitlab-org/-/epics/8492) because this is trusted GitLab-collected data directly from the upstream registries.
      1. A potential **future** third priority could be given to unverified, crowd-sourced information that was reported up through CycloneDX files outside of the current group; however, this is **out of scope** for this epic as we would need to consider adding additional privacy / opt-in / opt-out data sharing controls for users.
1. The External License Database will be automatically imported into the GitLab database on a regular basis. Self-managed users will be able to customize the URL location of the database file that is fetched so that internet-disconnected environments can still use the license scanning feature.
1. Depending on our estimated database size, we may need to consider how to provide a way to clean up or delete license data for dependencies that are no longer used, to avoid having the total table size grow indefinitely.
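
The priority order described for the license service can be illustrated with a small sketch. This is not the Rails implementation and does not reflect any existing GitLab code: it is written in Python purely for illustration, and every name in it (`Source`, `LicenseRecord`, `resolve_licenses`, the package names) is hypothetical. It assumes license records have already been fetched from the database and only shows how a group-reported CycloneDX license might win over one from the external license database.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Source(IntEnum):
    """Hypothetical source ranks; a lower value means a higher priority."""
    GROUP_SBOM = 1         # reported by a CycloneDX file in the same project or top-level group
    EXTERNAL_DATABASE = 2  # imported from the external license database


@dataclass(frozen=True)
class LicenseRecord:
    package: str        # illustrative package identifier
    license_name: str   # license text or identifier as stored
    source: Source


def resolve_licenses(
    dependencies: list[str], records: list[LicenseRecord]
) -> list[tuple[str, Optional[str]]]:
    """Return each dependency paired with its highest-priority license, or None."""
    best: dict[str, LicenseRecord] = {}
    for record in records:
        current = best.get(record.package)
        # Keep the record whose source has the better (lower) rank.
        if current is None or record.source < current.source:
            best[record.package] = record
    # Preserve the input order; dependencies without data map to None.
    return [
        (dep, best[dep].license_name if dep in best else None)
        for dep in dependencies
    ]


records = [
    LicenseRecord("left-pad", "MIT", Source.EXTERNAL_DATABASE),
    LicenseRecord("left-pad", "WTFPL", Source.GROUP_SBOM),
    LicenseRecord("rails", "MIT", Source.EXTERNAL_DATABASE),
]
print(resolve_licenses(["left-pad", "rails", "internal-lib"], records))
# [('left-pad', 'WTFPL'), ('rails', 'MIT'), ('internal-lib', None)]
```

In this sketch, a dependency with no known license is returned with `None` rather than dropped, so the output list always matches the input list of dependencies. The out-of-scope third priority (crowd-sourced data from outside the group) could be modeled as an additional `Source` value if it were ever added.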
### Further details

- ~backend engineering DRI: @fcatteau

<!-- Include use cases, benefits, goals, or any other details that will help us understand the problem better. -->

### Permissions and Security

<!-- What permissions are required to perform the described actions? Are they consistent with the existing permissions as documented for users, groups, and projects as appropriate? Is the proposed behavior consistent between the UI, API, and other access methods (e.g. email replies)?

Consider adding checkboxes and expectations of users with certain levels of membership https://docs.gitlab.com/ee/user/permissions.html

* [ ] Add expected impact to members with no access (0)
* [ ] Add expected impact to Guest (10) members
* [ ] Add expected impact to Reporter (20) members
* [ ] Add expected impact to Developer (30) members
* [ ] Add expected impact to Maintainer (40) members
* [ ] Add expected impact to Owner (50) members -->

Permissions will not change as a result of this Epic

### Documentation

<!-- See the Feature Change Documentation Workflow https://docs.gitlab.com/ee/development/documentation/workflow.html#for-a-product-change

* Add all known Documentation Requirements in this section. See https://docs.gitlab.com/ee/development/documentation/feature-change-workflow.html#documentation-requirements
* If this feature requires changing permissions, update the permissions document. See https://docs.gitlab.com/ee/user/permissions.html -->

### Availability & Testing

<!-- This section needs to be retained and filled in during the workflow planning breakdown phase of this feature proposal, if not earlier.

What risks does this change pose to our availability? How might it affect the quality of the product? What additional test coverage or changes to tests will be needed? Will it require cross-browser testing?

Please list the test areas (unit, integration and end-to-end) that needs to be added or updated to ensure that this feature will work as intended. Please use the list below as guidance.

* Unit test changes
* Integration test changes
* End-to-end test change

See the test engineering planning process and reach out to your counterpart Software Engineer in Test for assistance: https://about.gitlab.com/handbook/engineering/quality/test-engineering/#test-planning -->

### What does success look like, and how can we measure that?

<!-- Define both the success metrics and acceptance criteria. Note that success metrics indicate the desired business outcomes, while acceptance criteria indicate when the solution is working correctly. If there is no way to measure success, link to an issue that will implement a way to measure this. -->

### What is the type of buyer?

<!-- What is the buyer persona for this feature? See https://about.gitlab.com/handbook/marketing/product-marketing/roles-personas/buyer-persona/ In which enterprise tier should this feature go? See https://about.gitlab.com/handbook/product/pricing/#four-tiers -->

~"GitLab Ultimate"

### Is this a cross-stage feature?

<!-- Communicate if this change will affect multiple Stage Groups or product areas. We recommend always start with the assumption that a feature request will have an impact into another Group. Loop in the most relevant PM and Product Designer from that Group to provide strategic support to help align the Group's broader plan and vision, as well as to avoid UX and technical debt. https://about.gitlab.com/handbook/product/#cross-stage-features -->

### Links / references

<!-- Label reminders - you should have one of each of the following labels if you can figure out the correct ones -->

https://gitlab.com/groups/gitlab-org/-/epics/4082

<!-- triage-serverless v3 PLEASE DO NOT REMOVE THIS SECTION -->

*This page may contain information related to upcoming products, features and functionality. It is important to note that the information presented is for informational purposes only, so please do not rely on the information for purchasing or planning purposes. Just like with all projects, the items mentioned on the page are subject to change or delay, and the development, release, and timing of any products, features, or functionality remain at the sole discretion of GitLab Inc.*

<!-- triage-serverless v3 PLEASE DO NOT REMOVE THIS SECTION -->
epic