Optimize catalog component usage tracking

What does this MR do and why?

This MR optimizes catalog component usage tracking to eliminate N+1 database queries that were causing performance issues when processing pipelines with many components.

Problem

The component usage tracking worker was executing N database queries for N components, performing individual lookups and inserts for each component in a pipeline. This created significant performance bottlenecks for pipelines using multiple catalog components.

Solution: Batch Processing Flow

This MR introduces a new optimized flow that processes components in batches:

1. Preload Component Projects (Single Query)

  • Load all component projects at once using Project.id_in(component_project_ids)
  • Preload catalog_resource associations to avoid additional queries
  • Index projects by ID for O(1) lookup

2. Separate Catalog vs Non-Catalog Components

  • Filter components upfront based on whether they have a catalog_resource
  • Track all components (catalog + non-catalog) for analytics
  • Process only catalog components for usage recording

3. Group Components by Project and SHA

  • Group catalog components by [project_id, sha] pairs
  • Enables batch lookups for components from the same project/version

4. Batch Component Lookups

  • New method: ComponentsProject#find_catalog_components(component_names)
  • Fetches multiple components in a single query per project/SHA combination
  • Replaces N individual find_catalog_component calls

5. Batch Usage Record Operations

  • Single fetch: Load all existing usage records in one query
  • Partition: Separate into records to update vs. records to insert
  • Bulk update: Use update_all for existing records
  • Bulk insert: Use insert_all for new records

Performance Impact

Before: N queries for N components (lookup + insert/update per component)

After: Fixed number of queries regardless of component count:

  • 1 query to load component projects
  • 1 query per unique project/SHA combination to load components
  • 1 query to fetch existing usage records
  • 1 query to bulk update existing records
  • 1 query to bulk insert new records

Changes Summary

  1. New batch lookup method: Ci::Catalog::ComponentsProject#find_catalog_components
  2. Refactored service: Ci::Components::Usages::CreateService now accepts array of components
  3. Optimized worker: Ci::Catalog::Resources::TrackComponentUsageWorker implements batch processing flow
  4. N+1 prevention: Added query recorder tests to prevent regressions

Database Review

Query 1: Bulk Update Existing Usage Records

Location: app/services/ci/components/usages/create_service.rb:68

Query:

UPDATE catalog_resource_component_last_usages
SET last_used_date = '2025-12-09'
WHERE catalog_resource_component_last_usages.id IN (1, 2, 3)

Execution Plan:

Update on catalog_resource_component_last_usages  (cost=0.15..5.51 rows=0 width=0)
  ->  Index Scan using catalog_resource_component_last_usages_pkey on catalog_resource_component_last_usages  (cost=0.15..5.51 rows=3 width=10)
        Index Cond: (id = ANY ('{1,2,3}'::bigint[]))

Query 2: Bulk Insert New Usage Records

Location: app/services/ci/components/usages/create_service.rb:71

Query:

INSERT INTO catalog_resource_component_last_usages (component_id, used_by_project_id, catalog_resource_id, component_project_id, last_used_date)
VALUES
  (100, 200, 300, 400, '2025-12-09'),
  (101, 201, 301, 401, '2025-12-09'),
  (102, 202, 302, 402, '2025-12-09')

Execution Plan:

Insert on catalog_resource_component_last_usages  (cost=0.00..0.05 rows=0 width=0)
  ->  Values Scan on "*VALUES*"  (cost=0.00..0.05 rows=3 width=44)

Query 3: Fetch Existing Usage Records

Location: app/services/ci/components/usages/create_service.rb:33

Query:

SELECT "catalog_resource_component_last_usages".*
FROM "catalog_resource_component_last_usages"
WHERE "catalog_resource_component_last_usages"."component_id" IN (100, 101, 102)
  AND "catalog_resource_component_last_usages"."used_by_project_id" = 200


Issue: Optimize catalog component lookups in CI pipeli... (#581360)

Edited by Avielle Wolfe

Merge request reports

Loading