Change data format of exported advisories

Why are we doing this work

advisory-exporter currently exports advisory data according to the Gemnasium DB schema. However, the Rails backend expects data in a different format. For end-to-end (E2E) advisory ingestion to succeed, we should export data in the format the backend expects.

Relevant links

Non-functional requirements

See https://gitlab.com/gitlab-org/security-products/license-db/deployment/-/blob/main/docs/fullstack_development.md

  • Documentation:
  • Performance:
  • Testing:

Data format

Below you can see the current format of a CVE:

Current situation
{
    "date": "2023-06-07",
    "urls": [
      "https://nvd.nist.gov/vuln/detail/CVE-2023-32067",
      "https://github.com/c-ares/c-ares/releases/tag/cares-1_19_1",
      "https://github.com/c-ares/c-ares/security/advisories/GHSA-9g78-jv2r-p7vc",
      "https://lists.fedoraproject.org/archives/list/package-announce@lists.fedoraproject.org/message/B5Z5XFNXTNPTCBBVXFDNZQVLLIE6VRBY/",
      "https://lists.fedoraproject.org/archives/list/package-announce@lists.fedoraproject.org/message/UBFWILTA33LOSV23P44FGTQQIDRJHIY7/"
    ],
    "uuid": "559c87d6-cabd-4ed9-ad29-0cbcd31a7954",
    "title": "Uncontrolled Resource Consumption",
    "cvss_v3": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H",
    "cwe_ids": [
      "CWE-1035",
      "CWE-400",
      "CWE-937"
    ],
    "pubdate": "2023-05-25",
    "solution": "Upgrade to version 1.19.1 or above.",
    "identifier": "CVE-2023-32067",
    "description": "c-ares is an asynchronous resolver library. c-ares is vulnerable to denial of service. If a target resolver sends a query, the attacker forges a malformed UDP packet with a length of 0 and returns them to the target resolver. The target resolver erroneously interprets the 0 length as a graceful shutdown of the connection. This issue has been patched in version 1.19.1.",
    "identifiers": [
      "CVE-2023-32067",
      "GHSA-9g78-jv2r-p7vc"
    ],
    "not_impacted": "All versions starting from 1.19.1",
    "package_slug": "conan/c-ares",
    "affected_range": "<1.19.1",
    "fixed_versions": [
      "1.19.1"
    ],
    "affected_versions": "All versions before 1.19.1"
}
  
Desired situation
{
  "advisory": {
      "id": "559c87d6-cabd-4ed9-ad29-0cbcd31a7954",
      "source": "glad",
      "title": "Uncontrolled Resource Consumption",
      "description": "c-ares is an asynchronous resolver library. c-ares is vulnerable to denial of service. If a target resolver sends a query, the attacker forges a malformed UDP packet with a length of 0 and returns them to the target resolver. The target resolver erroneously interprets the 0 length as a graceful shutdown of the connection. This issue has been patched in version 1.19.1.",
      "cvss_v3": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H",
      "cvss_v2": "",
      "published_date": "2023-05-25",
      "urls": [
        "https://nvd.nist.gov/vuln/detail/CVE-2023-32067",
        "https://github.com/c-ares/c-ares/releases/tag/cares-1_19_1",
        "https://github.com/c-ares/c-ares/security/advisories/GHSA-9g78-jv2r-p7vc",
        "https://lists.fedoraproject.org/archives/list/package-announce@lists.fedoraproject.org/message/B5Z5XFNXTNPTCBBVXFDNZQVLLIE6VRBY/",
        "https://lists.fedoraproject.org/archives/list/package-announce@lists.fedoraproject.org/message/UBFWILTA33LOSV23P44FGTQQIDRJHIY7/"
      ],
      "identifiers": [
          {
            "type": "cve",
            "name": "CVE-2023-32067",
            "value": "CVE-2023-32067",
            "url": "https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2023-32067"
          },
          {
            "type": "ghsa",
            "name": "GHSA-9g78-jv2r-p7vc",
            "value": "GHSA-9g78-jv2r-p7vc",
            "url": "https://github.com/advisories/GHSA-9g78-jv2r-p7vc"
          },
          {
            "type": "cwe",
            "name": "CWE-1035",
            "value": "CWE-1035",
            "url": "https://cwe.mitre.org/data/definitions/CWE-1035.html"
          },
          {
            "type": "cwe",
            "name": "CWE-400",
            "value": "CWE-400",
            "url": "https://cwe.mitre.org/data/definitions/CWE-400.html"
          },
          {
            "type": "cwe",
            "name": "CWE-937",
            "value": "CWE-937",
            "url": "https://cwe.mitre.org/data/definitions/CWE-937.html"
          }
      ]
  },
  "packages": [
     {
       "name": "c-ares",
       "purl_type": "conan",
       "affected_range": "<1.19.1",
       "solution": "Upgrade to version 1.19.1 or above.",
       "fixed_versions": [
         "1.19.1"
       ]
     }
  ]
}

The new advisory schema is composed of two fields:

  • advisory: Contains the main information about the advisory. More specifically it contains the following fields:
    • id: refers to the uuid of the advisory.
    • source: this will be set to glad, representing the GitLab Advisory DB.
    • title: the title of the advisory
    • description: the description field of the advisory
    • cvss_v2, cvss_v3: these are optional fields. If empty, they can be omitted.
    • published_date: this field maps to the pubdate field of the advisory.
    • urls: maps to the urls field in the advisory.
    • identifiers: an array that combines the identifiers array and the cwe_ids array from the advisory. Both identifiers and cwe_ids are arrays of strings in the advisory. Each string value needs to become an identifier object according to this schema. We can reuse this code to generate these objects.
  • packages: this field contains information about the packages affected by the advisory. For this issue, the GitLab Advisory DB is the only source, so although packages is an array it will contain exactly one package. Each package contains the following fields:
    • affected_range: maps to the affected_range field from the advisory
    • solution: maps to the solution field from the advisory
    • fixed_versions: maps to the fixed_versions field from the advisory
    • name: the name of the package. This can be derived from the package_slug field of the advisory. The package_slug is in the form of purl_type/package_name.
    • purl_type: the purl_type of the package. This can be derived from the package_slug field of the advisory, which is in the form of purl_type/package_name. We need to be careful here, since the GitLab Advisory DB uses different type names for some package managers than the Rails backend expects. We should make sure that we use the value in the first column (PURL) of the following table. For example, if we have a go advisory we should export a purl_type of golang.
      PURL        gemnasium-db
      conan       conan
      gem         gem
      golang      go
      maven       maven
      npm         npm
      nuget       nuget
      composer    packagist
      pypi        pypi
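
The derivation of name and purl_type from package_slug can be sketched as follows. This is a minimal illustration in Python, not the exporter's actual code: the names GEMNASIUM_TO_PURL_TYPE and split_package_slug are hypothetical, the mapping is copied from the table above, and raising an error for an unknown type is an assumption about how such slugs should be handled.

```python
from typing import Tuple

# gemnasium-db type (prefix of package_slug) -> purl_type expected by the
# Rails backend. Values are copied from the table above.
GEMNASIUM_TO_PURL_TYPE = {
    "conan": "conan",
    "gem": "gem",
    "go": "golang",
    "maven": "maven",
    "npm": "npm",
    "nuget": "nuget",
    "packagist": "composer",
    "pypi": "pypi",
}


def split_package_slug(package_slug: str) -> Tuple[str, str]:
    """Return (purl_type, name) for a slug of the form <type>/<package_name>."""
    # Split on the first "/" only, so that a package name which itself
    # contains slashes (for instance a Go module path) stays intact.
    slug_type, sep, name = package_slug.partition("/")
    if not sep or not name:
        raise ValueError(f"malformed package_slug: {package_slug!r}")
    if slug_type not in GEMNASIUM_TO_PURL_TYPE:
        # Assumption: unknown types are rejected rather than passed through.
        raise ValueError(f"unsupported package type: {slug_type!r}")
    return GEMNASIUM_TO_PURL_TYPE[slug_type], name
```

With this sketch, split_package_slug("conan/c-ares") yields ("conan", "c-ares"), while a slug starting with go/ is exported with the golang type.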

Implementation Plan

  • Create a function that receives an advisory as input and provides the expected format as the output
  • Unit tests for this function
  • Document the new structure of the data in the README file
  • Add a contributing file according to this comment
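
As a starting point for the first two items, the conversion could look roughly like the sketch below. It is written in Python for illustration only and makes several assumptions: the function names convert_advisory and to_identifier are hypothetical, the identifier type is taken from the prefix before the first "-", the URL patterns are copied from the "Desired situation" example rather than from the referenced schema or code, and empty cvss fields are omitted instead of exported as empty strings.

```python
from typing import Any, Dict

# gemnasium-db type -> purl_type, copied from the table in "Data format".
PURL_TYPES = {
    "conan": "conan", "gem": "gem", "go": "golang", "maven": "maven",
    "npm": "npm", "nuget": "nuget", "packagist": "composer", "pypi": "pypi",
}

# URL patterns as they appear in the "Desired situation" example (assumption:
# the referenced schema and code remain the authoritative source for these).
IDENTIFIER_URLS = {
    "cve": "https://cve.mitre.org/cgi-bin/cvename.cgi?name={}",
    "ghsa": "https://github.com/advisories/{}",
    "cwe": "https://cwe.mitre.org/data/definitions/{}.html",
}


def to_identifier(value: str) -> Dict[str, str]:
    """Turn a string such as "CWE-400" into an identifier object."""
    id_type = value.split("-", 1)[0].lower()
    return {
        "type": id_type,
        "name": value,
        "value": value,
        # Unknown types get an empty URL (assumption).
        "url": IDENTIFIER_URLS.get(id_type, "").format(value),
    }


def convert_advisory(old: Dict[str, Any]) -> Dict[str, Any]:
    """Convert a Gemnasium-style advisory into the advisory/packages format."""
    # Split on the first "/" only; the package name may contain slashes.
    slug_type, _, name = old["package_slug"].partition("/")
    advisory = {
        "id": old["uuid"],
        "source": "glad",
        "title": old["title"],
        "description": old["description"],
        "published_date": old["pubdate"],
        "urls": old.get("urls", []),
        # identifiers first, then cwe_ids, as in the example.
        "identifiers": [
            to_identifier(v)
            for v in old.get("identifiers", []) + old.get("cwe_ids", [])
        ],
    }
    # cvss_v2 and cvss_v3 are optional: copy them only when non-empty.
    for key in ("cvss_v2", "cvss_v3"):
        if old.get(key):
            advisory[key] = old[key]
    package = {
        "name": name,
        "purl_type": PURL_TYPES[slug_type],  # KeyError for an unknown type
        "affected_range": old["affected_range"],
        "solution": old.get("solution", ""),
        "fixed_versions": old.get("fixed_versions", []),
    }
    # The GitLab Advisory DB is the only source, so exactly one package.
    return {"advisory": advisory, "packages": [package]}
```

A unit test can feed the "Current situation" example through convert_advisory and compare the result with the "Desired situation" example, field by field.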
Edited by Nick Ilieskou