Export licenses for ranges of versions

changed milestone to %Backlog

added Category:Software Composition Analysis, License-DB::development, backend, feature::enhancement, group::composition analysis, and type::feature labels

added devops::secure and section::sec labels

mentioned in issue #442419 (closed)

Ideally we would use semver_dialects in PMDB, possibly wrapped up in a service.

@fcatteau How do you envision this? Would it be a Ruby app exposed through an API? Or perhaps we could build a Docker image that contains both Ruby and Go, and then have the exporter call the semver Ruby app?

@nilieskou Yes, that would be an internal service the license-exporter would query using an internal API. This would be much faster than creating a new process each time the exporter needs to sort or compare versions.

We would have a dedicated image for this.

We would implement a minimal HTTP server and an API on top of semver_dialects.

There would be no need for authentication or rate limiting for that internal API.
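A minimal sketch of what such an internal endpoint could look like. It is written in Python for brevity, whereas the real service would be Ruby on top of semver_dialects. The `/sort` path, the JSON payload shape, and `version_key` (a naive stand-in for the semver_dialects ordering) are all assumptions made for illustration.

```python
import json
import re
from http.server import BaseHTTPRequestHandler, HTTPServer


def version_key(version):
    """Naive stand-in for semver_dialects ordering (assumption, not the real rules)."""
    parts = re.findall(r"\d+|[A-Za-z]+", version)
    # Numeric parts compare as integers, text parts as lowercase strings.
    return [(1, int(p), "") if p.isdigit() else (0, 0, p.lower()) for p in parts]


def handle_sort(body):
    """Take a JSON body {"versions": [...]} and return it sorted, as JSON bytes."""
    versions = json.loads(body)["versions"]
    return json.dumps({"versions": sorted(versions, key=version_key)}).encode()


class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/sort":  # hypothetical endpoint name
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = handle_sort(self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8080):
    """Run the service on localhost only; no authentication, as it is internal."""
    HTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

Calling `serve()` starts the server. The exporter would then send one request per batch of versions instead of spawning a process per comparison, which is where the speed-up mentioned above would come from.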

changed the description

mentioned in issue #462854 (closed)

changed the description

Alternatively, we could have a service that compresses raw NDJSON files (i.e. where all versions are listed) similar to the compression logic implemented in license-exporter. That service would be implemented in Ruby and would use semver_dialects.

As discussed on Slack, this approach would be a good first step, and it could already improve the accuracy of v2 exports, as well as the consistency between v2 exports and how they're processed by the backend.

See #462854 (comment 1920685160)

Ideally this should be a lossless compression, but because of discrepancies in version ordering (between the license-exporter using go-version and the backend using semver_dialects), it's not.
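To illustrate why an ordering mismatch makes range compression lossy, here is a small sketch. The `compress` and `lookup` helpers, the `(first, last, license)` range shape, and both orderings are hypothetical: `lexical_key` and `numeric_key` are deliberately different toy orderings standing in for the exporter side and the backend side, and neither reproduces go-version or semver_dialects.

```python
def compress(entries, key):
    """Collapse (version, license) pairs into (first, last, license) ranges.

    Entries are sorted with `key`; consecutive versions sharing a license merge.
    """
    ranges = []
    for version, license_id in sorted(entries, key=lambda e: key(e[0])):
        if ranges and ranges[-1][2] == license_id:
            ranges[-1] = (ranges[-1][0], version, license_id)
        else:
            ranges.append((version, version, license_id))
    return ranges


def lookup(ranges, version, key):
    """Return the license of the first range containing `version` under `key`."""
    for first, last, license_id in ranges:
        if key(first) <= key(version) <= key(last):
            return license_id
    return None


def lexical_key(version):
    """Toy ordering A: plain string comparison."""
    return version


def numeric_key(version):
    """Toy ordering B: compare dotted components as integers."""
    return tuple(int(part) for part in version.split("."))


entries = [
    ("1.1", "MIT"),
    ("1.2", "Apache-2.0"),
    ("1.3", "Apache-2.0"),
    ("1.10", "MIT"),
]

# Compress with ordering A, as the producer would.
ranges = compress(entries, lexical_key)

# Reading with the same ordering recovers the original license.
same_ordering = lookup(ranges, "1.2", lexical_key)

# Reading with ordering B places 1.2 inside the 1.1..1.10 range.
other_ordering = lookup(ranges, "1.2", numeric_key)
```

With these toy orderings, `same_ordering` is `Apache-2.0` (the true license of 1.2) while `other_ordering` is `MIT`. The compression is only lossless when producer and consumer agree on how versions are ordered.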

TODO: I should create an issue for this, and make it a blocking issue.

marked this issue as related to #464278

mentioned in issue #464278

marked this issue as related to #462857 (closed)

@fcatteau recently noticed that the v2 dataset is actually pretty close in size to the v1 dataset (internal link). Upon further investigation, it seems that other_licenses contributes quite a bit to the size: https://gitlab.com/-/snippets/3716827. This attribute is the one without version ranges; it simply lists the versions and their licenses.

@ifrenkel Thanks a lot for collecting these numbers! I'm adding this to the Problem to solve section.

The compression implemented in v2 is efficient if the vast majority of versions can be parsed and compared, and if they have the default licenses. Unfortunately that doesn't seem to be the case.

Maybe this is because many versions can't be parsed and compared using go-version. In that case, the efficiency of v2 exports can be improved by using semver_dialects in license-exporter. If instead it's because too many versions don't have the default licenses, then v2 exports can't be improved, and we need to implement this issue.
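One way to tell these two explanations apart would be to count, per package, why each version falls outside the compressed part of the export. The sketch below is only an illustration: `parse_or_none` is a toy parser that accepts dotted integers and does not reproduce go-version, `diagnose` is a hypothetical helper, and taking the most common license as the default is an assumption.

```python
from collections import Counter


def parse_or_none(version):
    """Toy parser standing in for go-version: accepts dotted integers only."""
    try:
        return tuple(int(part) for part in version.split("."))
    except ValueError:
        return None


def diagnose(entries):
    """Count why versions would be listed individually instead of compressed."""
    # Assumption: the default license is the most common one for the package.
    default = Counter(lic for _, lic in entries).most_common(1)[0][0]
    reasons = Counter()
    for version, lic in entries:
        if parse_or_none(version) is None:
            reasons["unparseable"] += 1
        elif lic != default:
            reasons["non_default_license"] += 1
        else:
            reasons["compressed"] += 1
    return default, dict(reasons)


default, reasons = diagnose([
    ("1.0.0", "MIT"),
    ("1.1.0", "MIT"),
    ("1.2.0.RELEASE", "MIT"),
    ("2.0.0", "Apache-2.0"),
])
```

A high `unparseable` count would point to the version parser as the bottleneck, while a high `non_default_license` count would suggest that the v2 format itself is the limit.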

changed the description

mentioned in issue #470151 (closed)

mentioned in issue #474606 (closed)

mentioned in merge request !165184 (merged)

mentioned in issue #499139

mentioned in issue #491600

changed the description

changed milestone to %17.9

@tkopel Any idea why this issue is planned for 17.9? This issue introduces a v3 export format. I think it is more important to work on using semver_dialects in the exporter, since this is causing many issues for Maven packages. Moreover, @hacks4oats's work on license expressions might require a v3 export format. So maybe it's worth investigating whether the two can be combined so that we don't generate too many new format versions. WDYT?

@nilieskou I don't disagree. It seems like @johncrowley would know. John?

@nilieskou - this was my oversight. I will remove it from %17.9 so that we can focus on using semver_dialects in the exporter. cc: @tkopel

We are eagerly anticipating significant improvements in license detection capabilities, as the current implementation is a critical blocker for us. Specifically, inconsistent and unreliable detection when using policies for whitelisted licenses is preventing us from effectively managing compliance. We would therefore appreciate these improvements being delivered soon. @annabaur fyi

@johncrowley This is also related to my comment in the epic.

changed milestone to %Backlog

mentioned in issue #508466 (closed)
