Skip to content

Geo - Does not sync expired job artifacts

What does this MR do?

Ignore expired job artifacts in Geo::JobArtifactRegistryFinder#find_registry_differences.

Query plans

These following query plans are from Geo on staging since we don't have Geo enabled in production.

  • Selective sync disabled:

    • Before:

      SELECT
      	"ci_job_artifacts"."id"
      FROM
      	"ci_job_artifacts"
      WHERE "ci_job_artifacts"."id" BETWEEN 1 AND 1000;

      Plan with execution: https://explain.depesz.com/s/EHmJ

    • After:

      SELECT
      	"ci_job_artifacts"."id"
      FROM
      	"ci_job_artifacts"
      WHERE (expire_at IS NULL
      	OR expire_at > '2020-06-22 23:18:12.809445')
      AND "ci_job_artifacts"."id" BETWEEN 1 AND 1000;

      Plan with execution: https://explain.depesz.com/s/bLaC

  • Selective sync disabled with files stored locally:

    • Before

      SELECT
      	"ci_job_artifacts"."id"
      FROM
      	"ci_job_artifacts"
      WHERE "ci_job_artifacts"."id" BETWEEN 1 AND 1000
      AND("ci_job_artifacts"."file_store" = 1 OR "ci_job_artifacts"."file_store" IS NULL)

      Plan with execution: https://explain.depesz.com/s/65Ol

    • After

      SELECT
      	"ci_job_artifacts"."id"
      FROM
      	"ci_job_artifacts"
      WHERE (expire_at IS NULL
      	OR expire_at > '2020-06-22 23:20:35.372104')
      AND "ci_job_artifacts"."id" BETWEEN 1 AND 1000
      AND("ci_job_artifacts"."file_store" = 1 OR "ci_job_artifacts"."file_store" IS NULL)

      Plan with execution: https://explain.depesz.com/s/7lB9

  • Selective sync enabled:

    • Before

      WITH "restricted_job_artifacts" AS (
      	SELECT
      		"ci_job_artifacts"."id"
      	FROM
      		"ci_job_artifacts"
      	WHERE
      		"ci_job_artifacts"."project_id" IN(
      			SELECT
      				"projects"."id" FROM "projects"
      			WHERE
      				"projects"."namespace_id" IN(WITH RECURSIVE "base_and_descendants" AS ((
      						SELECT
      							"geo_node_namespace_links"."namespace_id" AS id FROM "geo_node_namespace_links"
      						WHERE
      							"geo_node_namespace_links"."geo_node_id" = 100109)
      					UNION (
      						SELECT
      							"namespaces"."id" FROM "namespaces",
      							"base_and_descendants"
      						WHERE
      							"namespaces"."parent_id" = "base_and_descendants"."id"))
      				SELECT
      					"id" FROM "base_and_descendants" AS "namespaces"))
      )
      SELECT
      	"ci_job_artifacts"."id"
      FROM
      	"restricted_job_artifacts"
      	INNER JOIN "ci_job_artifacts" ON "restricted_job_artifacts"."id" = "ci_job_artifacts"."id"
      WHERE "ci_job_artifacts"."id" BETWEEN 1 AND 1000;

      Plan with execution: https://explain.depesz.com/s/zIBm

    • After

      WITH "restricted_job_artifacts" AS (
      	SELECT
      		"ci_job_artifacts"."id"
      	FROM
      		"ci_job_artifacts"
      	WHERE
      		"ci_job_artifacts"."project_id" IN(
      			SELECT
      				"projects"."id" FROM "projects"
      			WHERE
      				"projects"."namespace_id" IN(WITH RECURSIVE "base_and_descendants" AS ((
      						SELECT
      							"geo_node_namespace_links"."namespace_id" AS id FROM "geo_node_namespace_links"
      						WHERE
      							"geo_node_namespace_links"."geo_node_id" = 100109)
      					UNION (
      						SELECT
      							"namespaces"."id" FROM "namespaces",
      							"base_and_descendants"
      						WHERE
      							"namespaces"."parent_id" = "base_and_descendants"."id"))
      				SELECT
      					"id" FROM "base_and_descendants" AS "namespaces"))
      )
      SELECT
      	"ci_job_artifacts"."id"
      FROM
      	"restricted_job_artifacts"
      	INNER JOIN "ci_job_artifacts" ON "restricted_job_artifacts"."id" = "ci_job_artifacts"."id"
      WHERE (expire_at IS NULL
      	OR expire_at > '2020-06-22 23:14:32.748250')
      AND "ci_job_artifacts"."id" BETWEEN 1 AND 1000;

      Plan with execution: hhttps://explain.depesz.com/s/UpKg

  • Selective sync enabled with files stored locally:

    • Before

      WITH "restricted_job_artifacts" AS (
        SELECT
          "ci_job_artifacts"."id"
        FROM
          "ci_job_artifacts"
        WHERE
          "ci_job_artifacts"."project_id" IN(
            SELECT
              "projects"."id" FROM "projects"
            WHERE
              "projects"."namespace_id" IN(WITH RECURSIVE "base_and_descendants" AS ((
                  SELECT
                    "geo_node_namespace_links"."namespace_id" AS id FROM "geo_node_namespace_links"
                  WHERE
                    "geo_node_namespace_links"."geo_node_id" = 100109)
                UNION (
                  SELECT
                    "namespaces"."id" FROM "namespaces",
                    "base_and_descendants"
                  WHERE
                    "namespaces"."parent_id" = "base_and_descendants"."id"))
              SELECT
                "id" FROM "base_and_descendants" AS "namespaces"))
      )
      SELECT
        "ci_job_artifacts"."id"
      FROM
        "restricted_job_artifacts"
        INNER JOIN "ci_job_artifacts" ON "restricted_job_artifacts"."id" = "ci_job_artifacts"."id"
      WHERE ("ci_job_artifacts"."file_store" = 1
        OR "ci_job_artifacts"."file_store" IS NULL)
      AND "ci_job_artifacts"."id" BETWEEN 1 AND 1000;

      Plan with execution: https://explain.depesz.com/s/Zv0q

    • After

      WITH "restricted_job_artifacts" AS (
        SELECT
          "ci_job_artifacts"."id"
        FROM
          "ci_job_artifacts"
        WHERE
          "ci_job_artifacts"."project_id" IN(
            SELECT
              "projects"."id" FROM "projects"
            WHERE
              "projects"."namespace_id" IN(WITH RECURSIVE "base_and_descendants" AS ((
                  SELECT
                    "geo_node_namespace_links"."namespace_id" AS id FROM "geo_node_namespace_links"
                  WHERE
                    "geo_node_namespace_links"."geo_node_id" = 100109)
                UNION (
                  SELECT
                    "namespaces"."id" FROM "namespaces",
                    "base_and_descendants"
                  WHERE
                    "namespaces"."parent_id" = "base_and_descendants"."id"))
              SELECT
                "id" FROM "base_and_descendants" AS "namespaces"))
      )
      SELECT
        "ci_job_artifacts"."id"
      FROM
        "restricted_job_artifacts"
        INNER JOIN "ci_job_artifacts" ON "restricted_job_artifacts"."id" = "ci_job_artifacts"."id"
      WHERE ("ci_job_artifacts"."file_store" = 1
        OR "ci_job_artifacts"."file_store" IS NULL)
      AND(expire_at IS NULL OR expire_at > '2020-06-22 22:59:27.497704')
      AND "ci_job_artifacts"."id" BETWEEN 1 AND 1000;

      Plan with execution: https://explain.depesz.com/s/Rlko

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team

Related issues

Closes #222495 (closed)

Edited by Michael Kozono

Merge request reports