SPIKE: Investigate deps.dev as a data source for license-db
Topic to Evaluate
deps.dev is a project which makes open source package data available via a BigQuery dataset and API. This can be a valuable data source to augment and, in some cases, replace existing license-db functionality.
Tasks prior to evaluation
- Evaluate the data being made available by https://deps.dev/ and its suitability for use with license-db.
- Check that the BigQuery interface of deps.dev would allow License DB to fetch regular updates. &12515 (comment 1759547652)
Tasks to Evaluate
- Check whether license and terms are compatible (CC-BY 4.0) (see legal issue).
  - A notice is required at data location: https://gitlab.com/gitlab-com/legal-and-compliance/-/issues/1930#note_1771979223
- Compare and contrast data sources for current feeders vs deps.dev
  - Data completeness
    - Sanity check of representation of actual package registry data in the dataset: #439634 (comment 1794164170)
    - Contrast top 105 packages for each package registry between current dataset and deps.dev: #439634 (comment 1791875338)
    - Spot check known problematic package registries in current dataset: #439634 (comment 1794164170)
      - Go seems to have 4x fewer packages and needs to be investigated further: Spike: Assess golang package differences betwee... (#444231 - closed) • Nick Ilieskou • 17.0
    - Spot check known unknowns in current dataset: #439634 (comment 1797618145)
  - Freshness (recency of data) #439634 (comment 1791875338)
  - Reliability (unknowns, missing classes of data) #439634 (comment 1794164170)
    - See Go.
  - Limits (API, cost)
  - Risks #439634 (comment 1794243482)
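The completeness comparison above could be scripted roughly as follows. This is only a minimal sketch: it assumes both the current dataset and deps.dev have already been exported, per registry, to a mapping of (package name, version) to a set of license identifiers. The `LicenseMap` shape, the `Comparison` type and the `compare_sources` function are hypothetical names for illustration, not existing license-db code.

```python
from dataclasses import dataclass

# Hypothetical export shape (an assumption, not the license-db schema):
# {(package_name, version): frozenset of license identifiers}
LicenseMap = dict[tuple[str, str], frozenset[str]]


@dataclass
class Comparison:
    only_in_current: set   # package versions missing from deps.dev
    only_in_deps_dev: set  # package versions missing from the current dataset
    license_mismatch: set  # present in both, but license sets differ
    matching: set          # present in both with identical license sets


def compare_sources(current: LicenseMap, deps_dev: LicenseMap) -> Comparison:
    """Bucket package versions by how the two sources agree or disagree."""
    current_keys = set(current)
    deps_dev_keys = set(deps_dev)
    shared = current_keys & deps_dev_keys
    mismatch = {key for key in shared if current[key] != deps_dev[key]}
    return Comparison(
        only_in_current=current_keys - deps_dev_keys,
        only_in_deps_dev=deps_dev_keys - current_keys,
        license_mismatch=mismatch,
        matching=shared - mismatch,
    )


def coverage(result: Comparison) -> float:
    """Share of the current dataset's package versions also found in deps.dev."""
    in_both = len(result.matching) + len(result.license_mismatch)
    total_current = in_both + len(result.only_in_current)
    return in_both / total_current if total_current else 0.0
```

Running this per registry would give a quick view of where deps.dev is missing package versions (as suspected for Go) and where the two sources disagree on licenses, which could then be spot checked by hand.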
The task "Contrast ease of access (i.e. replicating CouchDB vs BigQuery)" was moved to PoC: add cargo as a license data source (#443841 - closed) • Igor Frenkel • 16.11 • On track
The following 2 tasks were moved out to SPIKE: Look at deps.dev non-license data sources (#444220) • Unassigned • Backlog:
- Compare and contrast advisory data (as above)
- Investigate other data that is made available
Timebox
3d
Interesting issues
Team
- Add ~"workflow::planning breakdown", ~"type::feature" and the corresponding ~devops::<stage> and ~group::<group> labels.
- Ping the PM and EM.
Edited by Igor Frenkel