Integration settings not propagated to projects under a large group tree

What does this MR do?

Optimizes the query used during integration propagation to find projects without a given integration. The original query was timing out on large group hierarchies (e.g. gitlab-org with ~6,200 projects).

Problem

When propagating integrations to projects under a group, the non_archived scope runs a correlated NOT EXISTS subquery on namespace_settings for every project row to check if any ancestor group is archived. For gitlab-org that's ~6,235 scans, almost always returning nothing since group archiving is rarely used. This was causing 15s statement timeouts in Sidekiq.

Solution

Introduced two new scopes behind the optimize_propagate_integration_projects feature flag:

  • Group.excluding_self_and_ancestors_archived, uses the && (overlap) operator to check archived ancestors in a single scan instead of per-row
  • Project.without_integration_prefiltered, like without_integration but skips the non_archived scope, since the caller already handles the archived filtering upstream via the group scope

The service method create_integration_for_projects_without_integration_belonging_to_group now combines both filters into one namespace subquery when the flag is enabled, so PostgreSQL resolves everything in a single InitPlan.

No changes to existing scopes or the instance-level propagation path.

Feature flag

optimize_propagate_integration_projects, actor-based on integration.group, disabled by default.

Query plan comparison

Tested against gitlab-org namespace (9970) with ~1,364 groups and ~6,240 projects, using Integrations::Jira.

Both queries ran back-to-back in the same DB Lab session (original first, then optimized).

Original query

The current query uses non_archived which chains a correlated NOT EXISTS subquery on namespace_settings. This checks if any ancestor group is archived by scanning namespace_settings.namespace_id = ANY(namespaces.traversal_ids) per project row.

For gitlab-org, that's 6,235 index scans on namespace_settings — almost always returning nothing since group archiving is rarely used.

Total time: 229.153 ms
Buffer hits: 91,435
Disk reads: 15,461
namespace_settings scans: 6,235 (once per project row)
LEFT JOIN namespaces PK lookups: 6,235
Full plan
Gather  (cost=1111.46..34989.70 rows=581 width=4) (actual time=3.154..223.769 rows=6227 loops=1)
  Workers Planned: 1
  Workers Launched: 1
  Buffers: shared hit=91435 read=15461 dirtied=290
  ->  Nested Loop Anti Join  (cost=111.46..33931.60 rows=342 width=4) (actual time=1.817..216.115 rows=3114 loops=2)
        ->  Nested Loop Left Join  (cost=111.18..33726.80 rows=343 width=32) (actual time=1.738..205.524 rows=3118 loops=2)
              ->  Nested Loop Anti Join  (cost=110.61..33503.06 rows=343 width=8) (actual time=1.646..184.540 rows=3118 loops=2)
                    ->  Nested Loop  (cost=110.05..33294.63 rows=359 width=8) (actual time=1.545..132.594 rows=3126 loops=2)
                          ->  Parallel Bitmap Heap Scan on namespaces namespaces_1  (rows=682 loops=2)
                          ->  Index Scan on projects  (rows=5 loops=1364)
                    ->  Index Only Scan on integrations  (rows=0 loops=6251)
              ->  Index Scan using namespaces_pkey on namespaces  (rows=1 loops=6235)
        ->  Index Only Scan on namespace_settings  (rows=0 loops=6235)
              Index Cond: (namespace_id = ANY (namespaces.traversal_ids))

Optimized query

The new query moves the archived check into the namespace subquery using the &&. PostgreSQL collects all 9,411 archived namespace IDs once via an InitPlan (2.4ms), then filters namespaces in a single bitmap scan, no per-row correlated subquery, no LEFT JOIN.

The total time on postgres.ai is 3.6s, but this is dominated by cold cache disk I/O (read=13998 pages). In production with warm buffers, this drops significantly. The key improvement is structural: the original query ran ~6,240 correlated NOT EXISTS subqueries (one per project row), while this query runs a single InitPlan and a straightforward nested loop with index scans.

Total time: 3.6s (cold cache on postgres.ai)
Buffer hits: 32,245
Disk reads: 13,998 (cold cache)
InitPlan (archived IDs): 2.4ms, runs once
Projects returned: 6,227
Full Query
 SELECT
  projects.id,
  projects.name,
  projects.path,
  projects.description,
  projects.created_at,
  projects.updated_at,
  projects.creator_id,
  projects.namespace_id,
  projects.last_activity_at,
  projects.import_url,
  projects.visibility_level,
  projects.archived,
  projects.avatar,
  projects.merge_requests_template,
  projects.star_count,
  projects.merge_requests_rebase_enabled,
  projects.import_type,
  projects.import_source,
  projects.approvals_before_merge,
  projects.reset_approvals_on_push,
  projects.merge_requests_ff_only_enabled,
  projects.issues_template,
  projects.mirror,
  projects.mirror_last_update_at,
  projects.mirror_last_successful_update_at,
  projects.mirror_user_id,
  projects.shared_runners_enabled,
  projects.runners_token,
  projects.build_allow_git_fetch,
  projects.build_timeout,
  projects.mirror_trigger_builds,
  projects.pending_delete,
  projects.public_builds,
  projects.last_repository_check_failed,
  projects.last_repository_check_at,
  projects.only_allow_merge_if_pipeline_succeeds,
  projects.has_external_issue_tracker,
  projects.repository_storage,
  projects.repository_read_only,
  projects.request_access_enabled,
  projects.has_external_wiki,
  projects.ci_config_path,
  projects.lfs_enabled,
  projects.description_html,
  projects.only_allow_merge_if_all_discussions_are_resolved,
  projects.repository_size_limit,
  projects.printing_merge_request_link_enabled,
  projects.auto_cancel_pending_pipelines,
  projects.service_desk_enabled,
  projects.cached_markdown_version,
  projects.delete_error,
  projects.last_repository_updated_at,
  projects.disable_overriding_approvers_per_merge_request,
  projects.storage_version,
  projects.resolve_outdated_diff_discussions,
  projects.remote_mirror_available_overridden,
  projects.only_mirror_protected_branches,
  projects.pull_mirror_available_overridden,
  projects.jobs_cache_index,
  projects.external_authorization_classification_label,
  projects.mirror_overwrites_diverged_branches,
  projects.pages_https_only,
  projects.external_webhook_token,
  projects.packages_enabled,
  projects.merge_requests_author_approval,
  projects.pool_repository_id,
  projects.runners_token_encrypted,
  projects.bfg_object_map,
  projects.detected_repository_languages,
  projects.merge_requests_disable_committers_approval,
  projects.require_password_to_approve,
  projects.max_pages_size,
  projects.max_artifacts_size,
  projects.pull_mirror_branch_prefix,
  projects.remove_source_branch_after_merge,
  projects.marked_for_deletion_at,
  projects.marked_for_deletion_by_user_id,
  projects.autoclose_referenced_issues,
  projects.suggestion_commit_message,
  projects.project_namespace_id,
  projects.hidden,
  projects.organization_id
FROM projects
WHERE (NOT EXISTS (
  SELECT 1
  FROM integrations
  WHERE integrations.project_id = projects.id
    AND integrations.type_new = 'Integrations::Jira'
))
AND projects.pending_delete = FALSE
AND projects.archived = FALSE
AND projects.namespace_id IN (
  SELECT namespaces.id
  FROM namespaces
  WHERE namespaces.type = 'Group'
    AND (traversal_ids @> ('{9970}'))
    AND (NOT (namespaces.traversal_ids::bigint[] && ARRAY(
      SELECT namespace_settings.namespace_id
      FROM namespace_settings
      WHERE namespace_settings.archived = TRUE
    )::bigint[]))
)
Full plan
 Gather  (cost=1454.11..34622.71 rows=577 width=845) (actual time=11.077..3623.633 rows=6227 loops=1)
   Workers Planned: 1
   Workers Launched: 1
   Buffers: shared hit=32245 read=13998 dirtied=290
   WAL: records=298 fpi=290 bytes=2207154
   InitPlan 1
     ->  Index Only Scan using index_namespace_settings_on_namespace_id_where_archived_true on public.namespace_settings  (cost=0.29..343.51 rows=8770 width=4) (actual time=0.022..2.416 rows=9411 loops=1)
           Heap Fetches: 833
           Buffers: shared hit=7603
   ->  Nested Loop Anti Join  (cost=110.61..33221.50 rows=339 width=845) (actual time=6.807..3610.676 rows=3114 loops=2)
         Buffers: shared hit=24642 read=13998 dirtied=290
         WAL: records=298 fpi=290 bytes=2207154
         ->  Nested Loop  (cost=110.05..33015.38 rows=355 width=845) (actual time=4.987..2476.748 rows=3122 loops=2)
               Buffers: shared hit=4663 read=8988 dirtied=289
               WAL: records=297 fpi=289 bytes=2199589
               ->  Parallel Bitmap Heap Scan on public.namespaces  (cost=109.48..1788.16 rows=614 width=4) (actual time=2.079..304.777 rows=678 loops=2)
                     Filter: (NOT ((namespaces.traversal_ids)::bigint[] && ((InitPlan 1).col1)::bigint[]))
                     Rows Removed by Filter: 4
                     Buffers: shared hit=186 read=1215 dirtied=25
                     WAL: records=25 fpi=25 bytes=199417
                     ->  Bitmap Index Scan using index_namespaces_on_traversal_ids_for_groups  (cost=0.00..109.22 rows=1054 width=0) (actual time=1.596..1.596 rows=1364 loops=1)
                           Index Cond: (namespaces.traversal_ids @> '{9970}'::integer[])
                           Buffers: shared hit=59 read=2
               ->  Index Scan using index_projects_on_namespace_id_and_repository_size_limit on public.projects  (cost=0.56..50.53 rows=33 width=845) (actual time=1.079..3.201 rows=5 loops=1356)
                     Index Cond: (projects.namespace_id = namespaces.id)
                     Filter: ((NOT projects.pending_delete) AND (NOT projects.archived))
                     Rows Removed by Filter: 0
                     Buffers: shared hit=4477 read=7773 dirtied=264
                     WAL: records=272 fpi=264 bytes=2000172
         ->  Index Only Scan using index_integrations_on_project_id_and_type_new_unique on public.integrations  (cost=0.56..0.58 rows=1 width=4) (actual time=0.362..0.362 rows=0 loops=6243)
               Index Cond: ((integrations.project_id = projects.id) AND (integrations.type_new = 'Integrations::Jira'::text))
               Heap Fetches: 2
               Buffers: shared hit=19979 read=5010 dirtied=1
               WAL: records=1 fpi=1 bytes=7565
Settings: random_page_cost = '1.5', seq_page_cost = '4', effective_cache_size = '472585MB', work_mem = '230MB', jit = 'off'

Summary

We're seeing this in production (_id :AZxRFL2Kw2iKiDdYSb-1), enabling Jira on a group with ~1,364 subgroups and ~6,240 projects hits the 15s statement timeout due to a correlated NOT EXISTS subquery on namespace_settings that runs per project row.

The fix (behind optimize_propagate_integration_projects FF) replaces that with the && overlap operator, PostgreSQL collects archived namespace IDs once as an InitPlan, then filters in a single pass.

Original Optimized
Total time 207ms 181ms
Buffer hits 105,489 46,215
Disk reads 1,378 0
namespace_settings scans 6,235 1 (InitPlan)

Related to #577221

Edited by Rez

Merge request reports

Loading