Ingest malicious advisories in PMDB
## Executive Summary

**Malicious packages pose an immediate and growing threat to software supply chains.** Unlike vulnerabilities, which may remain dormant for months or years, malicious packages cause direct harm upon ingestion. Common attack vectors include credential theft, data exfiltration, botnet deployment, and database corruption. Users are particularly vulnerable because these packages often originate from trusted repositories like npm or rubygems.

**This epic focuses on integrating malicious advisories into PMDB** from the ~"group::vulnerability research" team's private malicious package repository. The ingested data will be made available through a private bucket, accessible exclusively to ~"GitLab Premium" and ~"GitLab Ultimate" users who have purchased the supply chain attack protection add-on. This new offering will enhance existing subscriptions with comprehensive malware advisory coverage and broader supply chain security capabilities.

#### Engineering Assessment

This epic involves several components:

| Component | Description | Group |
|-----------|-------------|-------|
| PMDB | Ingest malware advisories from the private repository and export to a GCP private bucket | ~"group::composition analysis" |
| Private Malicious Advisories Git Repo | Centralized source of malware advisory data | ~"group::vulnerability research" |
| GitLab Rails App | Sync malware advisory data and surface as vulnerabilities in the platform | ~"group::security insights" |
| UI | Display malware advisory information in the Secure tab for Ultimate users | ~"group::security insights" |
| Auth | Gate malware advisory access to add-on subscribed users only | TBD |

All of these components need to be extended to handle malware advisories. The table above does not capture the effort required for authenticating users. This epic covers the PMDB component only.
### Proposal

* Malicious dependencies are stored in a private Malicious Advisories repo.
* PMDB will check hourly for new or updated malicious advisories and ingest them.
* PMDB will export the malicious advisories to a private bucket.
* ~"group::composition analysis" will provide security credentials (such as a service account) to the team that needs to generate tokens for accessing the private bucket.

![image.png](/uploads/53da95142fe00775eae1a36b04eac44d/image.png){width="900" height="361"}

ADR MR: https://gitlab.com/gitlab-com/content-sites/handbook/-/merge_requests/18229

### Requirements

* We should be able to handle 500 advisories per day on average, with spikes of up to 100,000 advisories per day (https://gitlab.com/gitlab-org/gitlab/-/issues/584273#note_3030503540). For a projection of what we need to handle, see https://gitlab.com/gitlab-org/gitlab/-/work_items/588462+
* The VR malware git repo will have the [following structure](https://gitlab.com/gitlab-org/secure/vulnerability-research/pocs/malwares/).
* A single GLAM malware advisory contains exactly one affected package. Upstream advisories affecting multiple packages are split into separate GLAM advisories. GLAM malware advisories are [marked as withdrawn](https://ossf.github.io/osv-schema/#withdrawn-field) when either (1) the upstream advisory is marked as withdrawn, or (2) the upstream advisory is removed from the source. Withdrawn advisories remain in the database for historical tracking.
* Malware advisories will use a unique GitLab identifier like `GLAM-YYYY-MM-NNNNN` (https://gitlab.com/gitlab-org/gitlab/-/work_items/588568).
* We need a unique identifier for each malware advisory. We can use the identifier above, but we need to update the GitLab SBOM Vulnerability Scanner. See https://gitlab.com/groups/gitlab-org/-/epics/20538#note_3033085156
* Costs should be taken into account.
We should consider either a v3 that uses Cloudflare for the bucket, or v2 with an update to how Rails syncs data.

### Dependencies

To extend the PMDB component for malicious advisories, we depend on the ~"group::vulnerability research" team to provide us with a private git repo.

/cc @dabeles @dbolkensteyn

### DRIs

* Engineer: @nilieskou
* EM: @nilieskou
* PM: @joelpatterson | @ashalem
* Engineering Owner: @rvider
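
### Illustrative sketch of the requirements

The requirements in this epic (identifier format, one package per advisory, the two withdrawal conditions, and the hourly ingestion volume) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the PMDB implementation: it assumes `GLAM-YYYY-MM-NNNNN` means a 4-digit year, a 2-digit month, and a 5-digit sequence number, and it assumes load is spread evenly across 24 hourly runs. The `UpstreamAdvisory` type and all function names are hypothetical.

```python
import math
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Assumed reading of `GLAM-YYYY-MM-NNNNN`:
# 4-digit year, month 01-12, 5-digit sequence number.
GLAM_ID = re.compile(r"GLAM-\d{4}-(0[1-9]|1[0-2])-\d{5}")


def is_valid_glam_id(identifier: str) -> bool:
    """Return True if the identifier matches the assumed GLAM format."""
    return GLAM_ID.fullmatch(identifier) is not None


@dataclass
class UpstreamAdvisory:
    """Hypothetical upstream record; field names are illustrative only."""
    upstream_id: str
    packages: List[str]
    withdrawn: bool = False


def split_by_package(advisory: UpstreamAdvisory) -> List[Tuple[str, str]]:
    """Produce one (upstream_id, package) pair per affected package,
    so that each resulting GLAM advisory covers exactly one package."""
    return [(advisory.upstream_id, package) for package in advisory.packages]


def should_withdraw(upstream: Optional[UpstreamAdvisory]) -> bool:
    """Apply the two withdrawal conditions from the requirements.

    `None` stands for "the upstream advisory was removed from the source".
    """
    return upstream is None or upstream.withdrawn


def advisories_per_run(per_day: int, runs_per_day: int = 24) -> int:
    """Advisories each run must handle if load is spread evenly
    over the day (an assumption; real spikes may be burstier)."""
    return math.ceil(per_day / runs_per_day)
```

Representing a removed upstream advisory as `None` keeps both withdrawal conditions in a single predicate. Under the even-distribution assumption, `advisories_per_run(500)` is 21 and `advisories_per_run(100_000)` is 4167, which gives a rough lower bound on what a single hourly run has to absorb during a spike.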
epic