Add brief description of how License Compliance detects licenses
Problem to solve
Customers would like to know how License Compliance detects licenses. There is currently no documentation on the topic.
Further details
This issue was raised as a result of an internal Slack discussion.
Proposal
Add the following information to https://docs.gitlab.com/ee/user/compliance/license_compliance/:
"Detection is based on the metadata exposed by the package and on any LICENSE.* files found in the source package. The detected licenses are compared against a fixed list of licenses in License Finder. In addition, we do our best to match license names against the SPDX catalogue of software licenses (https://spdx.org/licenses/licenses.json). If the package has metadata, such as a .gemspec, we use it. We also install the package, scan the install directory for files such as LICENSE, and make a best guess at the license or licenses in those files. The scanner does not scan source files for embedded license text.
If an SPDX expression is detected (for example, A OR B, A AND B, or A WITH Exception-B), we parse it and report its members as a list of licenses. The UI does not currently indicate the logical relationship between those licenses."
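To illustrate the last point for the docs, here is a minimal sketch of what "parse the expression and report a list" could look like. It is not the scanner's actual code: the function name `flatten_spdx_expression` is hypothetical, and listing the exception after WITH as its own entry is an assumption made for the example.

```python
import re

# Operators from the SPDX license expression syntax.
OPERATORS = {"AND", "OR", "WITH"}


def flatten_spdx_expression(expression):
    """Return the identifiers in an SPDX expression as a flat list.

    Hypothetical helper: operators and parentheses are discarded, so the
    logical relationship between the licenses is lost. Exceptions that
    follow WITH are listed like any other identifier (an assumption).
    """
    # Identifiers are made of letters, digits, '.', '-', '+' and ':'.
    # Parentheses and whitespace are not matched, so they drop out here.
    tokens = re.findall(r"[A-Za-z0-9.+:-]+", expression)

    licenses = []
    for token in tokens:
        if token in OPERATORS:
            continue  # drop AND / OR / WITH
        if token not in licenses:
            licenses.append(token)  # keep first occurrence, preserve order
    return licenses


if __name__ == "__main__":
    print(flatten_spdx_expression("MIT OR Apache-2.0"))
    print(flatten_spdx_expression("(MIT AND BSD-3-Clause) OR MIT"))
    print(flatten_spdx_expression("GPL-2.0-only WITH Classpath-exception-2.0"))
```

With this approach, "MIT OR Apache-2.0" and "MIT AND Apache-2.0" produce the same list, which is the limitation the proposed text describes for the UI.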
Who can address the issue
Anyone.
Other links/references
Source Slack thread: https://gitlab.slack.com/archives/C8S0HHM44/p1602690385478300