Skip to content

Improve query for projects in rebalancing service

Heinrich Lee Yu requested to merge improve-issue-rebalancing-query into master

What does this MR do and why?

Query everything from projects takes a long time and wastes a lot of memory. We only need the IDs so we only select this column.

Related to #416438 (comment 1478700889)

Updated queries

  1. Fetching all projects

  2. Fetching first project

    • Before:

      SELECT "projects".*
      FROM "projects"
      WHERE "projects"."namespace_id" IN (
        SELECT namespaces.traversal_ids[array_length(namespaces.traversal_ids, 1)] AS id FROM "namespaces" WHERE "namespaces"."type" = 'Group' AND (traversal_ids @> ('{6935877}'))
      )
      ORDER BY "projects"."id" ASC
      LIMIT 1

      (PG chooses a bad plan for this group and the query takes very long on postgres.ai)

    • After:

      SELECT "projects"."id", "projects"."namespace_id"
      FROM "projects"
      WHERE "projects"."namespace_id" IN (
        SELECT namespaces.traversal_ids[array_length(namespaces.traversal_ids, 1)] AS id FROM "namespaces" WHERE "namespaces"."type" = 'Group' AND (traversal_ids @> ('{6935877}'))
      )
      LIMIT 1

      https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/20512/commands/67218

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Heinrich Lee Yu

Merge request reports