
Draft: POC: Sync EPSS data located in a dedicated EPSS bucket

Nick Ilieskou requested to merge nilieskou/sync_for_epss into master

What does this MR do and why?

This MR assumes that EPSS scores are stored in a separate, dedicated bucket.

  • Add a feature flag epss_ingestion
  • Add a cronjob for epss_ingestion
  • Add a sync worker for EPSS ingestion
  • Add EpssDataObject, which stores EPSS data entries from the public bucket
  • Extend ObjectDataFabricator to handle EPSS data
  • Extend the SyncConfiguration and Location classes with EPSS-related functionality. Unfortunately, purl_type is still somewhat misused here: it is tightly coupled with the class but is not used for EPSS.
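To illustrate the data shape these changes deal with, here is a minimal sketch of turning NDJSON lines (one JSON object per line, with cve_id and score keys as in the sample files in this description) into simple value objects. EpssEntry and parse_epss_ndjson are hypothetical names for illustration only; the actual EpssDataObject in this MR may be structured differently.

```ruby
require 'json'

# Hypothetical value object for illustration; not the MR's EpssDataObject.
EpssEntry = Struct.new(:cve_id, :score, keyword_init: true)

# Parse NDJSON text (one JSON object per line) into EpssEntry values.
# Blank lines are skipped; a missing key raises KeyError.
def parse_epss_ndjson(text)
  text.each_line.map(&:strip).reject(&:empty?).map do |line|
    row = JSON.parse(line)
    EpssEntry.new(cve_id: row.fetch('cve_id'), score: Float(row.fetch('score')))
  end
end

sample = <<~NDJSON
  {"cve_id":"CVE-2020-2304", "score":0.1}
  {"cve_id":"CVE-2021-2304", "score":0.23}
NDJSON

parse_epss_ndjson(sample).each { |entry| puts "#{entry.cve_id}: #{entry.score}" }
```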

How to set up and validate locally in offline mode

  1. Create a new directory for EPSS data at $GITLAB_RAILS_ROOT_DIR/vendor/package_metadata/epss:

    mkdir -p $GITLAB_RAILS_ROOT_DIR/vendor/package_metadata/epss
  2. Add the following file inside that directory:

Path: v2/0/00000001.ndjson 

Content: 
{"cve_id":"CVE-2020-2304", "score":0.1}
{"cve_id":"CVE-2021-2304", "score":0.23}
{"cve_id":"CVE-2022-2304", "score":0.25}
  3. Open the Rails console and start the sync process:

    PM_SYNC_IN_DEV=true rails c
    
    [1] pry(main)> Feature.enable(:epss_ingestion)
    
    [2] pry(main)> module PackageMetadata
      class MyEpssSyncWorker
        include ExclusiveLeaseGuard
    
        def lease_timeout
          5.minutes
        end
    
        def perform
          try_obtain_lease do
            SyncService.execute(data_type: 'epss', lease: exclusive_lease)
          end
        end
      end
    end
    
    [3] pry(main)> PackageMetadata::MyEpssSyncWorker.new.perform
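Steps 1 and 2 of the offline setup can also be scripted. The following is a minimal sketch, assuming that the path v2/0/00000001.ndjson is relative to the epss directory created in step 1; write_epss_fixture and EPSS_FIXTURE are hypothetical helper names, not part of this MR.

```ruby
require 'fileutils'

# Sample rows copied from the setup steps above.
EPSS_FIXTURE = <<~NDJSON
  {"cve_id":"CVE-2020-2304", "score":0.1}
  {"cve_id":"CVE-2021-2304", "score":0.23}
  {"cve_id":"CVE-2022-2304", "score":0.25}
NDJSON

# Create <rails_root>/vendor/package_metadata/epss/v2/0/00000001.ndjson
# (assumed layout) and return the path of the written file.
def write_epss_fixture(rails_root)
  path = File.join(rails_root, 'vendor', 'package_metadata', 'epss',
                   'v2', '0', '00000001.ndjson')
  FileUtils.mkdir_p(File.dirname(path))
  File.write(path, EPSS_FIXTURE)
  path
end
```

It could be called as write_epss_fixture(ENV.fetch('GITLAB_RAILS_ROOT_DIR')) before opening the Rails console.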

How to set up and validate locally against a GCP bucket

  1. Create a new directory for EPSS data at $GITLAB_RAILS_ROOT_DIR/vendor/package_metadata/epss:

    mkdir -p $GITLAB_RAILS_ROOT_DIR/vendor/package_metadata/epss
  2. Create a GCP bucket named <TEST_GCP_BUCKET> and add the following file to it:

Path: 
epss/v2/0/00000001.ndjson 

Content: 
{"cve_id":"CVE-2020-2304", "score":0.1}
{"cve_id":"CVE-2021-2304", "score":0.23}
{"cve_id":"CVE-2022-2304", "score":0.25}
  3. Install the gsutil tool.

  4. Sync the EPSS bucket to the local directory using gsutil:

    gsutil -m rsync -r -d gs://<TEST_GCP_BUCKET> $GITLAB_RAILS_ROOT_DIR/vendor/package_metadata/epss
  5. Open the Rails console and start the sync process:

    PM_SYNC_IN_DEV=true rails c
    
    [1] pry(main)> Feature.enable(:epss_ingestion)
    
    [2] pry(main)> module PackageMetadata
      class MyEpssSyncWorker
        include ExclusiveLeaseGuard
    
        def lease_timeout
          5.minutes
        end
    
        def perform
          try_obtain_lease do
            SyncService.execute(data_type: 'epss', lease: exclusive_lease)
          end
        end
      end
    end
    
    [3] pry(main)> PackageMetadata::MyEpssSyncWorker.new.perform
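Before starting the sync, it can help to sanity-check the files that gsutil copied. The sketch below is not part of this MR; it checks that each NDJSON row has a cve_id in the usual CVE-YYYY-NNNN form and a score between 0 and 1, the range of an EPSS probability. epss_ndjson_problems and CVE_ID_PATTERN are hypothetical names.

```ruby
require 'json'

# CVE identifiers: "CVE-", a four-digit year, then four or more digits.
CVE_ID_PATTERN = /\ACVE-\d{4}-\d{4,}\z/

# Return a list of human-readable problems found in NDJSON text.
# An empty list means every non-blank row looks sane.
def epss_ndjson_problems(text)
  problems = []
  text.each_line.with_index(1) do |line, lineno|
    next if line.strip.empty?

    begin
      row = JSON.parse(line)
    rescue JSON::ParserError
      problems << "line #{lineno}: invalid JSON"
      next
    end

    unless row.is_a?(Hash)
      problems << "line #{lineno}: not a JSON object"
      next
    end

    cve = row['cve_id']
    score = row['score']
    unless cve.is_a?(String) && cve.match?(CVE_ID_PATTERN)
      problems << "line #{lineno}: bad cve_id"
    end
    unless score.is_a?(Numeric) && score.between?(0, 1)
      problems << "line #{lineno}: score out of range"
    end
  end
  problems
end
```

For example, epss_ndjson_problems(File.read(path)) can be run on each synced .ndjson file, with any returned messages reviewed before enabling the sync.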

Similar MRs

Ingest advisory and affected package data to DB (!123149 - merged)

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Nick Ilieskou
