Create vulnerabilities ES Index

What does this MR do and why?

Creates the Vulnerabilities ES index along with the document schema and default index settings.

The other MR was merged and then reverted for this reason.

Database

Preloading logic query plans:

How to test the index settings and the query against test data

  1. Partial text search test data is available here.
  2. The partial_text_search_test.rb file contains a hash where the custom mapping and the query under test can be added. Add your custom index settings here and the query to test here.
  3. After adding your custom index settings and query, run ruby partial_text_search_test.rb. The script runs the tests and lists the best index settings and query. Ideally, the best setting and query have a 100% success rate with no false positives.
  4. Follow the same instructions for full_text_search_test.rb.
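
To make the "100% success rate without any false positives" target concrete, here is a minimal sketch of how such a score could be computed for one candidate. It does not reproduce the internals of partial_text_search_test.rb: the `CaseResult` struct, the `score_candidate` method, and the sample IDs are all hypothetical names introduced for illustration.

```ruby
require "set"

# Hypothetical shape of one test case result: the IDs a query was expected
# to return and the IDs it actually returned. These names are illustrative
# and are not taken from partial_text_search_test.rb.
CaseResult = Struct.new(:expected_ids, :returned_ids)

# Scores one candidate (index settings + query) over all test cases.
# A case passes only when the returned set equals the expected set;
# any returned ID that was not expected counts as a false positive.
def score_candidate(results)
  passed = 0
  false_positives = 0

  results.each do |r|
    expected = r.expected_ids.to_set
    returned = r.returned_ids.to_set
    passed += 1 if expected == returned
    false_positives += (returned - expected).size
  end

  success_rate = results.empty? ? 0.0 : (passed * 100.0 / results.size)
  { success_rate: success_rate, false_positives: false_positives }
end

results = [
  CaseResult.new([1, 2], [1, 2]), # exact match, passes
  CaseResult.new([3], [3, 4])     # ID 4 is a false positive, fails
]
score = score_candidate(results)
puts score
```

Under this definition, a candidate only reaches the ideal when every case passes and the false positive count is zero.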

How to set up and validate locally

Setup

Seed vulnerabilities locally:

  1. Import the project from here into your local instance using the import by URL option.
  2. In the imported project, run a pipeline on the master branch and let it complete. This seeds the vulnerabilities data.
  3. To populate the pm_cve_enrichment table with data for the epss_scores field, follow the instructions in the readme.md file of the imported project.

Run the ES migration:

  1. In the Rails console, run the migration: Elastic::DataMigrationService[20250408180015].migrate

Backfill ES index with documents manually:

  1. In the Rails console, run the command below to track every vulnerability for indexing:
     Vulnerabilities::Read.all.each { |v| ::Elastic::ProcessBookkeepingService.track!(Search::Elastic::References::Vulnerability.new(v.vulnerability_id, "group_#{v.project.namespace.root_ancestor.id}")) }
  2. Run the bookkeeping command to process the tracked references:
     Elastic::ProcessBookkeepingService.new.execute
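
As an optional sanity check after the backfill (not part of the original steps), the number of indexed documents can be compared with the number of rows that were tracked. The sketch below uses Elasticsearch's `_count` API and assumes the cluster is at localhost:9200 with the alias gitlab-development-vulnerabilities, as in the validation steps. The helper names `indexed_count`, `fetch_indexed_count`, and `backfill_report` are hypothetical.

```ruby
require "json"
require "net/http"
require "uri"

# Assumption: local Elasticsearch on port 9200 and the development alias.
COUNT_URL = "http://localhost:9200/gitlab-development-vulnerabilities/_count"

# Extracts the document count from a _count response body.
def indexed_count(body)
  JSON.parse(body).fetch("count")
end

# Fetches the current document count (needs a running cluster).
def fetch_indexed_count(url = COUNT_URL)
  indexed_count(Net::HTTP.get(URI(url)))
end

# Returns a short human-readable comparison of indexed vs expected documents.
def backfill_report(indexed, expected)
  if indexed == expected
    "OK: #{indexed} documents indexed"
  else
    "MISMATCH: #{indexed} indexed, #{expected} expected"
  end
end

# In the Rails console, after the bookkeeping command has finished:
#   puts backfill_report(fetch_indexed_count, Vulnerabilities::Read.count)
```

A mismatch right after the backfill does not necessarily indicate a bug: newly indexed documents only become visible after an index refresh, and the bookkeeping service may need more than one run to drain its queue.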

Validation steps:

  1. Run GET gitlab-development-vulnerabilities/_settings in the Kibana Dev console, or curl "http://localhost:9200/gitlab-development-vulnerabilities/_settings". The response should list the new index created by the migration command above.
  2. Find the full index name in that response, for example gitlab-development-vulnerabilities-20250319-2109 (the timestamp suffix will likely differ on your machine). Verify that the mappings were created successfully by running GET gitlab-development-vulnerabilities/_mapping in Kibana, or curl "http://localhost:9200/gitlab-development-vulnerabilities/_mapping". The response should look like the one below:
{
  "gitlab-development-vulnerabilities-20250407-2006": {
    "mappings": {
      "dynamic": "strict",
      "_meta": {
        "created_by": "17.11.0-pre"
      },
      "properties": {
        "archived": {
          "type": "boolean"
        },
        "auto_resolved": {
          "type": "boolean"
        },
        "casted_cluster_agent_id": {
          "type": "long"
        },
        "cluster_agent_id": {
          "type": "text"
        },
        "created_at": {
          "type": "date"
        },
        "dismissal_reason": {
          "type": "short"
        },
        "epss_scores": {
          "type": "float"
        },
        "has_issues": {
          "type": "boolean"
        },
        "has_merge_request": {
          "type": "boolean"
        },
        "has_remediations": {
          "type": "boolean"
        },
        "has_vulnerability_resolution": {
          "type": "boolean"
        },
        "id": {
          "type": "long"
        },
        "identifier_names": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "normalizer": "lower_case_normalizer"
            },
            "ngram": {
              "type": "text",
              "analyzer": "identifier_ngram_analyzer",
              "search_analyzer": "standard"
            }
          },
          "analyzer": "identifier_pattern_analyzer"
        },
        "location_image": {
          "type": "text"
        },
        "project_id": {
          "type": "long"
        },
        "report_type": {
          "type": "short"
        },
        "resolved_on_default_branch": {
          "type": "boolean"
        },
        "scanner_external_id": {
          "type": "text"
        },
        "scanner_id": {
          "type": "long"
        },
        "schema_version": {
          "type": "short"
        },
        "severity": {
          "type": "short"
        },
        "state": {
          "type": "short"
        },
        "traversal_ids": {
          "type": "keyword"
        },
        "type": {
          "type": "keyword"
        },
        "updated_at": {
          "type": "date"
        },
        "uuid": {
          "type": "binary"
        },
        "vulnerability_id": {
          "type": "long"
        }
      }
    }
  }
}
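
Comparing the response by eye is error-prone, so the check can also be scripted. The following is a minimal sketch, assuming the same localhost:9200 endpoint as the curl command. The `mapping_problems` and `fetch_mapping` helpers and the `EXPECTED_TYPES` subset are hypothetical names. Only a handful of fields from the expected response above are listed, and the list can be extended as needed.

```ruby
require "json"
require "net/http"
require "uri"

# Assumption: local Elasticsearch on port 9200 and the development alias.
MAPPING_URL = "http://localhost:9200/gitlab-development-vulnerabilities/_mapping"

# A few field types copied from the expected response; extend as needed.
EXPECTED_TYPES = {
  "vulnerability_id" => "long",
  "severity" => "short",
  "epss_scores" => "float",
  "traversal_ids" => "keyword",
  "uuid" => "binary",
  "identifier_names" => "text"
}.freeze

# Returns a list of problems found in a parsed _mapping response.
# The response is keyed by the concrete index name, so every index
# behind the alias is checked.
def mapping_problems(response, expected = EXPECTED_TYPES)
  response.flat_map do |index_name, body|
    mappings = body.fetch("mappings", {})
    props = mappings.fetch("properties", {})
    problems = []
    problems << "#{index_name}: dynamic is not strict" unless mappings["dynamic"] == "strict"
    expected.each do |field, type|
      actual = props.dig(field, "type")
      problems << "#{index_name}: #{field} is #{actual.inspect}, expected #{type}" unless actual == type
    end
    problems
  end
end

# Fetches and parses the mapping (needs a running cluster).
def fetch_mapping(url = MAPPING_URL)
  JSON.parse(Net::HTTP.get(URI(url)))
end

# Usage, with Elasticsearch running locally:
#   problems = mapping_problems(fetch_mapping)
#   puts problems.empty? ? "mapping OK" : problems
```

The check is kept separate from the HTTP call so it can also be run against a saved copy of the response.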

Related to #515553 (closed)

Edited by Bala Kumar
