
Create an EPSS ingestion service

Introduction

The flow of package_metadata on the GitLab side is:

  1. A cron job executes the worker for the relevant data type (licenses, advisories, or epss).
  2. The worker runs the SyncService, which handles the package_metadata flow for each purl type or for epss.
  3. SyncService retrieves the SyncConfiguration for epss.
  4. SyncService uses the relevant connector (offline or GCP) to iterate over all new files (chunks) in the bucket since the last checkpoint.
  5. SyncService executes IngestionService for the given data type.
  6. The IngestionService runs a set of IngestionTasks.
  7. Each IngestionTask parses and upserts the given data.
  8. The checkpoint is updated to reflect that we have progressed and data has been ingested.
  9. Continue until all data has been inserted or a stop signal is received.
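
The steps above can be sketched in plain Ruby to show how the checkpoint, the ingestion service, and the tasks fit together. This is only an illustration under stated assumptions, not the GitLab implementation: the class bodies, method names (files_after, execute), the [sequence, chunk] checkpoint shape, the row keys (cve_id, epss_score), and the sample CVE id are all hypothetical.

```ruby
# Hypothetical, simplified stand-ins for the components named in the flow.
# None of these signatures are taken from the GitLab codebase.

# Step 4: a connector returns the files (chunks) added since the checkpoint.
class InMemoryConnector
  def initialize(files)
    @files = files # ordered array of { sequence:, chunk:, rows: } hashes
  end

  def files_after(checkpoint)
    @files.select { |f| ([f[:sequence], f[:chunk]] <=> checkpoint) > 0 }
  end
end

# Step 7: a task parses rows and upserts them, keyed by CVE id (assumed key).
class CveEnrichmentTask
  def initialize(store)
    @store = store # a Hash standing in for the database table
  end

  def execute(rows)
    rows.each do |row|
      # Writing by key makes this an upsert: insert if new, overwrite if present.
      @store[row.fetch("cve_id")] = { epss_score: Float(row.fetch("epss_score")) }
    end
  end
end

# Steps 5-6: the ingestion service runs each task over the parsed rows.
class IngestionService
  def initialize(tasks)
    @tasks = tasks
  end

  def execute(rows)
    @tasks.each { |task| task.execute(rows) }
  end
end

# Steps 4-9: iterate over new files, ingest, then advance the checkpoint.
class SyncService
  attr_reader :checkpoint

  def initialize(connector:, ingestion_service:, checkpoint: [0, 0], stop_signal: -> { false })
    @connector = connector
    @ingestion_service = ingestion_service
    @checkpoint = checkpoint
    @stop_signal = stop_signal
  end

  def execute
    @connector.files_after(@checkpoint).each do |file|
      break if @stop_signal.call

      @ingestion_service.execute(file[:rows])
      # Only move the checkpoint once the file has been ingested.
      @checkpoint = [file[:sequence], file[:chunk]]
    end
  end
end

# Usage with made-up sample data: the second chunk updates the first chunk's record.
store = {}
connector = InMemoryConnector.new([
  { sequence: 1, chunk: 0, rows: [{ "cve_id" => "CVE-0000-0001", "epss_score" => "0.5" }] },
  { sequence: 1, chunk: 1, rows: [{ "cve_id" => "CVE-0000-0001", "epss_score" => "0.7" }] }
])
sync = SyncService.new(
  connector: connector,
  ingestion_service: IngestionService.new([CveEnrichmentTask.new(store)])
)
sync.execute
```

Updating the checkpoint only after a file has been ingested means that a run interrupted by the stop signal resumes from the last fully ingested file, and the upsert keeps a repeated file from creating duplicate records.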

This issue covers the implementation of the ingestion part of this flow.

Implementation Plan

Test! You may create a CVE Enrichment object in ee/spec/factories/package_metadata, similar to the one in ee/spec/factories/package_metadata/advisory_data_objects.rb.

  • Implement ee/spec/services/package_metadata/ingestion/cve_enrichment/cve_enrichment_ingestion_task_spec.rb, following the pattern of nearby tests.
  • Implement ee/spec/services/package_metadata/ingestion/cve_enrichment/ingestion_service_spec.rb.
Edited by Yasha Rise