Geo: Replicate AbuseReport uploads

What does this MR do and why?

This merge request adds support for replicating abuse report uploads across multiple GitLab instances using Geo replication.

The changes create a new database table, "abuse_report_upload_states", that tracks the verification state of each abuse report upload as Geo replicates it between GitLab sites. The table records when verification started, whether it succeeded or failed, and retry information for failed attempts.
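As a rough illustration of the kind of state such a table tracks, here is a minimal plain-Ruby sketch. It is not the MR's code: the class name `UploadVerificationState`, the state names, and the exponential backoff formula are all assumptions made for illustration.

```ruby
# Hypothetical in-memory model of one row of verification state.
# None of these names come from the MR; they are illustrative only.
class UploadVerificationState
  MAX_RETRY_DELAY = 3600 # seconds; assumed cap on the backoff

  attr_reader :state, :started_at, :verified_at, :retry_count, :retry_at, :failure

  def initialize
    @state = :pending
    @retry_count = 0
  end

  # Record when verification began.
  def start!(now: Time.now)
    @state = :started
    @started_at = now
  end

  # Mark success and clear any previous failure and retry bookkeeping.
  def succeed!(now: Time.now)
    @state = :succeeded
    @verified_at = now
    @retry_count = 0
    @retry_at = nil
    @failure = nil
  end

  # Mark failure and schedule the next attempt with exponential backoff.
  def fail!(message, now: Time.now)
    @state = :failed
    @failure = message
    @retry_count += 1
    delay = [2**@retry_count, MAX_RETRY_DELAY].min
    @retry_at = now + delay
  end

  # True once a failed record has waited long enough to be retried.
  def retry_due?(now: Time.now)
    state == :failed && !retry_at.nil? && now >= retry_at
  end
end
```

Under these assumptions, a record that fails once at time `t` becomes eligible for retry at `t + 2` seconds, and a later success resets the retry counters.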

The code also adds new API endpoints and GraphQL queries that allow administrators to check the replication status of these abuse report uploads. This helps ensure that when users submit abuse reports with file attachments, those files are properly synchronized across all GitLab instances in a multi-server setup.

Additionally, the changes include database migrations that create the table and the indexes needed to query replication status efficiently. The new functionality integrates with GitLab's existing Geo replication framework, which keeps secondary GitLab sites in sync with the primary for disaster recovery and faster access for distributed teams.

References

How to set up and validate locally

See !224164 (merged)

Database Queries

  • Selective Sync Disabled:

    • Raw SQL

      SELECT
          "abuse_report_uploads".*
      FROM
          "abuse_report_uploads"
      WHERE
          "abuse_report_uploads"."id" BETWEEN 1 AND 10000;
    • Query Plan: https://explain.depesz.com/s/7eze

  • Selective Sync by Groups:

    • Raw SQL

      SELECT
          "abuse_report_uploads".*
      FROM
          "abuse_report_uploads"
      WHERE
          "abuse_report_uploads"."id" BETWEEN 1 AND 10000
          AND "abuse_report_uploads"."organization_id" IN (
              SELECT DISTINCT
                  "namespaces"."organization_id"
              FROM
                  "namespaces"
              WHERE
                  "namespaces"."id" IN (
                      WITH RECURSIVE "base_and_descendants" AS (
                          (
                              SELECT
                                  "geo_node_namespace_links"."namespace_id" AS id
                              FROM
                                  "geo_node_namespace_links"
                              WHERE
                                  "geo_node_namespace_links"."geo_node_id" = 2)
                          UNION (
                              SELECT
                                  "namespaces"."id"
                              FROM
                                  "namespaces",
                                  "base_and_descendants"
                              WHERE
                                  "namespaces"."parent_id" = "base_and_descendants"."id"))
                      SELECT
                          "id"
                      FROM
                          "base_and_descendants" AS "namespaces"));
    • Query Plan: https://explain.depesz.com/s/TXcn

  • Selective Sync by Organizations:

    • Raw SQL

      SELECT
          "abuse_report_uploads"."id"
      FROM
          "abuse_report_uploads"
      WHERE
          "abuse_report_uploads"."id" BETWEEN 1 AND 10000
          AND "abuse_report_uploads"."organization_id" IN (
              SELECT
                  "organizations"."id"
              FROM
                  "organizations"
                  INNER JOIN "geo_node_organization_links" ON "organizations"."id" = "geo_node_organization_links"."organization_id"
              WHERE
                  "geo_node_organization_links"."geo_node_id" = 2);
    • Query Plan: https://explain.depesz.com/s/RgDU
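To make the intent of the "Selective Sync by Groups" query above easier to follow, the sketch below reproduces its logic over in-memory data in plain Ruby. The sample rows and the helpers `descendant_ids` and `uploads_in_scope` are hypothetical; only the column names are taken from the SQL.

```ruby
require "set"

# Expand a list of namespace ids to include all of their descendants,
# mirroring the recursive "base_and_descendants" CTE.
def descendant_ids(namespaces, root_ids)
  seen = Set.new(root_ids)
  frontier = root_ids.dup
  until frontier.empty?
    children = namespaces
      .select { |ns| frontier.include?(ns[:parent_id]) }
      .map { |ns| ns[:id] }
      .reject { |id| seen.include?(id) }
    seen.merge(children)
    frontier = children
  end
  seen
end

# Keep uploads whose organization owns at least one selected namespace
# (the selected groups plus all of their descendants).
def uploads_in_scope(uploads, namespaces, selected_namespace_ids)
  ids = descendant_ids(namespaces, selected_namespace_ids)
  org_ids = namespaces.select { |ns| ids.include?(ns[:id]) }
                      .map { |ns| ns[:organization_id] }
                      .to_set
  uploads.select { |u| org_ids.include?(u[:organization_id]) }
end

# Hypothetical sample data.
namespaces = [
  { id: 1, parent_id: nil, organization_id: 10 },
  { id: 2, parent_id: 1,   organization_id: 10 },
  { id: 3, parent_id: nil, organization_id: 20 }
]
uploads = [
  { id: 100, organization_id: 10 },
  { id: 101, organization_id: 20 }
]

in_scope = uploads_in_scope(uploads, namespaces, [1])
puts in_scope.map { |u| u[:id] }.inspect # prints [100]
```

Note that the filter works at the organization level: an upload matches when its organization owns any namespace in the selected groups or their descendants, which is what the `SELECT DISTINCT "namespaces"."organization_id"` subquery expresses.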

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Douglas Barbosa Alexandre
