
Spike: OpenSSF Scorecard in the pmdb corpus

Summary

Adding OpenSSF Scorecard data to pmdb would enrich the experience of users evaluating their dependencies by letting them see the quality and health of the related projects.

Proposal

Look at the deps.dev and OpenSSF BigQuery datasets and evaluate whether enough data is available to link a GitLab project's dependencies to the corresponding OSS projects.

Note: this spike does not cover the UX of showing the score in the Dependency List, only the feasibility of fetching and storing the dependency data.
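As a rough illustration of the linking step, the sketch below turns a purl into a (system, name, version) key of the kind that could be looked up in a package-to-project dataset. Everything here is an assumption for discussion: the `PURL_TYPE_TO_SYSTEM` mapping is illustrative and not a finding of this spike, the `PackageKey` and `purl_to_package_key` names are hypothetical, and the parser handles only the common `pkg:type/namespace/name@version` shape (qualifiers and subpaths are dropped, and no ecosystem-specific name normalization is applied).

```python
from typing import NamedTuple, Optional
from urllib.parse import unquote

# Assumption: illustrative purl type -> package system mapping.
# Which purl types can actually be resolved is an open question for the spike.
PURL_TYPE_TO_SYSTEM = {
    "npm": "NPM",
    "golang": "GO",
    "maven": "MAVEN",
    "pypi": "PYPI",
    "cargo": "CARGO",
    "nuget": "NUGET",
}


class PackageKey(NamedTuple):
    """Hypothetical lookup key for a package-to-project dataset."""
    system: str
    name: str
    version: Optional[str]


def purl_to_package_key(purl: str) -> Optional[PackageKey]:
    """Return a lookup key for a purl, or None if the purl type is unmapped."""
    if not purl.startswith("pkg:"):
        raise ValueError(f"not a purl: {purl!r}")

    # Drop the subpath (#...) and qualifiers (?...); they are not needed here.
    remainder = purl[len("pkg:"):].split("#", 1)[0].split("?", 1)[0]
    purl_type, _, path = remainder.partition("/")

    system = PURL_TYPE_TO_SYSTEM.get(purl_type.lower())
    if system is None or not path:
        return None

    # The version follows the last '@'. A leading '@' belongs to the name.
    name_part, sep, version = path.rpartition("@")
    if not sep or not name_part:
        name_part, version = path, None

    segments = [unquote(segment) for segment in name_part.split("/")]
    # Assumption: Maven names are written "group:artifact"; others keep "/".
    joiner = ":" if system == "MAVEN" else "/"
    name = joiner.join(segments)

    return PackageKey(system, name, unquote(version) if version else None)
```

A purl whose type is not in the mapping returns `None`, which makes it easy to count how many dependencies in a given project could not be linked at all.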

Tasks

  • Evaluate whether there is a usable dataset for our needs (e.g. offline data)
  • Evaluate licensing for this dataset
  • Evaluate whether enough information exists to show a scorecard value in the Dependency List
  • Look at the amount of data that will be processed by pmdb and by the GitLab instance
  • Look at query models for fetching scorecard data (e.g. whole dataset or deltas)
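To make the "whole dataset or deltas" trade-off in the last task concrete, here is a minimal sketch of computing a delta between two scorecard snapshots. It assumes each snapshot can be reduced to a mapping from repository name to aggregate score; the `scorecard_delta` name, the snapshot shape, and the sample values are hypothetical and are not taken from any real dataset.

```python
from typing import Dict, List, Tuple

# Assumption: a snapshot maps a repository name to its aggregate score.
Snapshot = Dict[str, float]


def scorecard_delta(
    previous: Snapshot, current: Snapshot
) -> Tuple[Dict[str, float], List[str]]:
    """Return (upserts, deletes) needed to move from previous to current."""
    # New repositories and repositories whose score changed must be written.
    upserts = {
        repo: score
        for repo, score in current.items()
        if previous.get(repo) != score
    }
    # Repositories that disappeared from the dataset must be removed.
    deletes = sorted(repo for repo in previous if repo not in current)
    return upserts, deletes


# Made-up sample data, for illustration only.
previous = {"github.com/a/a": 7.1, "github.com/b/b": 5.0, "github.com/c/c": 3.2}
current = {"github.com/a/a": 7.1, "github.com/b/b": 5.4, "github.com/d/d": 9.0}

upserts, deletes = scorecard_delta(previous, current)
print(upserts)  # {'github.com/b/b': 5.4, 'github.com/d/d': 9.0}
print(deletes)  # ['github.com/c/c']
```

Comparing the number of upserts and deletes against the size of the full snapshot gives a first estimate of how much a delta model would save over re-exporting the whole dataset.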

Completion criteria

  • Describe relevant scorecard parameters
  • Describe dataset
  • Recommend a BigQuery query to fetch OSSF data
  • Table of the purl types for which data can be fetched
  • Approximate changes to the feeder, interfacer, database, and exporter
  • What exported data will look like at rest (in GCP storage)
  • Changes to GitLab instance database needed to store this data
  • Note any other useful data
Edited by Igor Frenkel