Skip to content

Geo: Improve performance of backfill queries

Problem to solve

Many backfill queries no longer rely on FDW due to work in &2851 (closed), and therefore some queries which were written to perform well with FDW should now be written without CTEs because it may improve performance. Also it should be more readable.

This issue is follow up of this discussion !32797 (comment 354651914).

Intended users

User experience goal

Improved performance of backfill queries.

Proposal

Remove CTEs from backfill queries where FDW is no longer used. We can start with just one case, as a small iteration.

  • Remove CTE from Packages::PackageFile.replicables_for_geo_node
  • Open follow up issue to do the same for the others

Further details

From !32797 (diffs, comment 354651914):

I found a sample query looks like:

WITH "restricted_packages" AS (
  SELECT "packages_packages"."id"
  FROM "packages_packages"
  WHERE "packages_packages"."project_id" IN (
    SELECT "projects"."id" FROM "projects"."namespace_id" = 9970))
SELECT "packages_package_files"."id"
FROM "restricted_packages"
INNER JOIN "packages_package_files"
ON "restricted_packages"."id" = "packages_package_files"."package_id"
WHERE "packages_package_files"."id" BETWEEN 1 AND 1000

with an explain plan here: https://explain.depesz.com/s/KzO

Is there a reason for using the CTE here? My concern is that if the list of projects is too large, the CTE will access/materialize a large data set. For example that query can be rewritten to not use a CTE:

SELECT packages_package_files.id
FROM packages_package_files
INNER JOIN packages_packages
ON packages_package_files.package_id = packages_packages.id
WHERE packages_packages.project_id IN (SELECT projects.id FROM projects WHERE namespace_id = 9970)
AND packages_package_files.id BETWEEN 1 AND 1000

with an explain here: https://explain.depesz.com/s/HNkO

Permissions and Security

Documentation

Availability & Testing

What does success look like, and how can we measure that?

What is the type of buyer?

Is this a cross-stage feature?

Links / references

Edited by Michael Kozono