Spike: Efficient storage of redundant licenses for SBOM component versions
Time-box: 2 days
Topic to Evaluate
In #372212 (comment 1091206383) we identified that licenses tend to be the same across all versions of an SBOM component, and that we should leverage this redundancy to keep the Postgres DB as lean as possible. We identified two ways to remove the redundancy:
- Track license data of SBOM components along with ranges of versions sharing the same licenses.
- Track license data of SBOM components, and also track license data of SBOM component versions. The former is used as a default value, and the latter is only set for versions whose licenses differ from that default.
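To make the two options easier to compare, here is a minimal Python sketch of how a license lookup might work under each of them. It is illustrative only: `RangeEntry`, `ComponentLicenses`, and both lookup helpers are hypothetical names, versions are modelled as integer tuples rather than real ecosystem-specific version strings, and ranges are assumed to be inclusive. None of this reflects the actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Optional, Tuple

# Assumption: versions are plain integer tuples such as (1, 2, 0).
# Real package versions need an ecosystem-specific parser and comparator.
Version = Tuple[int, ...]
Licenses = FrozenSet[str]


@dataclass
class RangeEntry:
    """Option 1: one record per inclusive version range sharing the same licenses."""
    low: Version
    high: Version
    licenses: Licenses


def licenses_from_ranges(ranges: List[RangeEntry], version: Version) -> Optional[Licenses]:
    """Return the licenses of the first range containing `version`, or None."""
    for entry in ranges:
        if entry.low <= version <= entry.high:
            return entry.licenses
    return None


@dataclass
class ComponentLicenses:
    """Option 2: a component-level default plus per-version exceptions."""
    default: Licenses
    exceptions: Dict[Version, Licenses] = field(default_factory=dict)

    def licenses_for(self, version: Version) -> Licenses:
        # Any version without an explicit exception falls back to the default.
        return self.exceptions.get(version, self.default)


mit = frozenset({"MIT"})
apache = frozenset({"Apache-2.0"})

ranges = [
    RangeEntry((1, 0, 0), (1, 9, 9), mit),
    RangeEntry((2, 0, 0), (2, 5, 0), apache),
]
component = ComponentLicenses(default=mit, exceptions={(2, 0, 0): apache})

print(licenses_from_ranges(ranges, (1, 4, 2)))
print(licenses_from_ranges(ranges, (3, 0, 0)))
print(component.licenses_for((2, 0, 0)))
print(component.licenses_for((3, 0, 0)))
```

In this sketch the range-based lookup returns nothing for a version outside every known range, whereas the default-based lookup returns the default for any version it has never seen. Which behavior is preferable is part of what the spike needs to evaluate.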
Before moving forward with Update DB schema to store data imported from th... (#373163 - closed) and updating the DB schema, we need to identify the best option.
We'll compare the options using the following criteria:
- size of the DB tables used to track licenses of SBOM components
- feasibility of importing the External License Database
- feasibility of License Scanning
- speed/complexity when performing License Scanning
Tasks to Evaluate
- Collect data on redundancies.
  - How frequent are SBOM components whose licenses are the same across all versions?
  - Ideally, get a distribution of the number of distinct sets of licenses per SBOM component.
- Evaluate storing licenses of components w/ version range.
  - Estimate size of DB tables.
  - Check feasibility of license data import.
  - Check feasibility of License Scanning.
  - Estimate relative complexity of License Scanning.
- Evaluate storing licenses of components (default), and licenses of versions (exceptions).
  - Estimate size of DB tables.
  - Check feasibility of license data import.
  - Check feasibility of License Scanning.
  - Estimate relative complexity of License Scanning.
- Evaluate storing licenses of component versions, omitting redundancies.
  - Estimate size of DB tables.
  - Check feasibility of license data import.
  - Check feasibility of License Scanning.
  - Estimate relative complexity of License Scanning.
- Choose one option.
  - Update #373163 (closed) with the option that's been selected.
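As a rough aid for the "collect data" and "estimate size" tasks, the following Python sketch counts distinct license sets per component and estimates how many records each storage option would need for one component. It rests on several assumptions that are not confirmed by this issue: versions are integer tuples, the default in the "default plus exceptions" option is the most frequent license set, and "omitting redundancies" means storing a record only for the first version of each run of identical licenses. The function names and sample data are hypothetical.

```python
from collections import Counter
from typing import Dict, FrozenSet, List, Tuple

Version = Tuple[int, ...]
Licenses = FrozenSet[str]
History = List[Tuple[Version, Licenses]]


def estimate_rows(history: History) -> Dict[str, int]:
    """Estimate record counts per storage option for a single component."""
    ordered = sorted(history, key=lambda item: item[0])
    sets = [licenses for _, licenses in ordered]
    if not sets:
        return {"distinct_sets": 0, "version_ranges": 0,
                "default_plus_exceptions": 0, "change_points": 0}
    # A "run" is a maximal sequence of consecutive versions with identical licenses.
    runs = sum(1 for i, s in enumerate(sets) if i == 0 or s != sets[i - 1])
    # Assumption: the default is the most frequent license set.
    most_common_count = Counter(sets).most_common(1)[0][1]
    return {
        "distinct_sets": len(set(sets)),
        "version_ranges": runs,  # one record per range
        "default_plus_exceptions": 1 + len(sets) - most_common_count,
        "change_points": runs,  # one record per version introducing a change
    }


def distinct_set_distribution(components: Dict[str, History]) -> Counter:
    """Map 'number of distinct license sets' to 'number of components'."""
    return Counter(len({lic for _, lic in hist}) for hist in components.values())


mit = frozenset({"MIT"})
apache = frozenset({"Apache-2.0"})

# Hypothetical sample data: component "a" changes licenses twice, "b" never does.
components = {
    "a": [((1, 0), mit), ((1, 1), mit), ((2, 0), apache), ((2, 1), apache), ((2, 2), mit)],
    "b": [((0, 1), mit), ((0, 2), mit), ((0, 3), mit)],
}

for name, history in components.items():
    print(name, estimate_rows(history))
print(dict(distinct_set_distribution(components)))
```

Record counts are only a proxy for table size; the width of each record (for example, two version columns for a range versus one for a change point) would also need to be factored in.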
Risks and Implementation Considerations
/cc @brytannia
Auto-Summary 🤖
Discoto Usage
Points
Discussion points are declared by headings, list items, and single lines that start with the text (case-insensitive) point:. For example, the following are all valid points:

    #### POINT: This is a point
    * point: This is a point
    + Point: This is a point
    - pOINT: This is a point
    point: This is a **point**

Note that any markdown used in the point text will also be propagated into the topic summaries.
Topics
Topics can be stand-alone and contained within an issuable (epic, issue, MR), or can be inline.
Inline topics are defined by creating a new thread (discussion) where the first line of the first comment is a heading that starts with (case-insensitive) topic:. For example, the following are all valid topics:

    # Topic: Inline discussion topic 1
    ## TOPIC: **{+A Green, bolded topic+}**
    ### tOpIc: Another topic

Quick Actions

| Action | Description |
| --- | --- |
| /discuss sub-topic TITLE | Create an issue for a sub-topic. Does not work in epics |
| /discuss link ISSUABLE-LINK | Link an issuable as a child of this discussion |
Last updated by this job
- TOPIC Storing licenses for versions where licenses change #374901 (comment 1110980868)
  - efficient storage #374901 (comment 1111924326)
  - only needs comparison of versions #374901 (comment 1111928145)
  - simple to evaluate #374901 (comment 1111939674)
  - Size of DB table #374901 (comment 1111983450)
  - Import #374901 (comment 1117940020)
  - License Scanning #374901 (comment 1117957521)
  - Total size of DB tables #374901 (comment 1118718916)
- TOPIC Storing licenses of components with version range #374901 (comment 1111873025)
  - need for a version range syntax #374901 (comment 1111929635)
  - complex upserts when the licenses change #374901 (comment 1111932712)
- TOPIC Storing licenses of components, and licenses of outlier versions #374901 (comment 1111895166)
  - Too many records when multiple license sets #374901 (comment 1111918356)
  - Too many inserts to track new versions with non-default licenses #374901 (comment 1111918356)
  - Default not suitable to versions not yet listed #374901 (comment 1111918356)
- TOPIC Accuracy when storing licenses of versions introducing a change #374901 (comment 1115937001)
- TOPIC Composite primary keys #374901 (comment 1121926442)
- TOPIC No compression for project-specific license data #374901 (comment 1122314183)
Discoto Settings
---
summary:
  max_items: -1
  sort_by: created
  sort_direction: ascending
See the settings schema for details.