Skip to content

Backend: Denormalize released_at in catalog_resource_versions to improve query performance

Summary

In #429707 (closed), we updated the catalog resource Release logic to use the Versions association. This change required creating queries on the Ci::Catalog::Resources::Version model that JOIN with the releases table. This joining causes inefficient database query performance. We should denormalize the necessary releases columns and add them to catalog_resource_versions, which will allow us to avoid the JOIN statements all together. This will improve query performance.

Proposal

Denormalize catalog_resources_versions.released_at and then update the query plans in the Ci::Catalog::Resources::Version model.

We discussed in #432189 (comment 1666538958) that the fields we're syncing between catalog_resource_versions and releases have a low risk of going out of sync and a non-critical impact if it does happen. So we can rely on syncing via model callbacks.

Steps

  1. Create a migration to add released_at to catalog_resource_versions. The data type and constraints should match releases.released_at exactly.
  • Add index on catalog_resource_id and released_at.
  1. Add model callbacks to sync the columns when a new version is created and when releases.released_at is updated.
  2. Create a data migration to backfill the released_at column on existing versions.
  3. Update the query plans in the Ci::Catalog::Resources::Version model.

MR Implementation

Step Links
1 Add released_at column to catalog_resource_vers... (!136139 - merged)
2 Sync catalog_resource_versions.released_at with... (!141817 - merged)
3 Backfill catalog_resource_versions.released_at ... (!142255 - merged), Background migration finalization MR: Finalize batched migration BackfillCatalogResou... (!143114 - merged)
4 Update Ci catalog Version model to use denormal... (!143455 - merged)
Edited by Leaminn Ma