Partial backfilling in partitioned project_daily_statistics

What does this MR do and why?

Epic: Source Code: Table Cleanup: Partition project_d... (&18879 - closed)

Issue: Partitioning: Initiate the project_daily_statis... (#560725 - closed)

This MR is part of implementing table partitioning for the project_daily_statistics table to address performance and scalability issues caused by its rapidly growing size. The table stores daily aggregated metrics for GitLab projects, and partitioning will improve query performance and enable more efficient data retention management.

  1. Implements a migration to backfill the partitioned project_daily_statistics table starting from records dated August 1, 2025
  2. Enhances the enqueue_partitioning_data_migration method to support partial backfilling by adding an optional batch_min_value parameter
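The idea behind a partial backfill can be sketched in plain Ruby. This is only an illustration under stated assumptions, not the actual migration helper: `Row`, `batch_min_value_for`, and `batch_ranges` are hypothetical names, the rows live in memory instead of a database, and ids are assumed to increase with the record date. Only the `batch_min_value` name comes from this MR.

```ruby
require 'date'

# Hypothetical in-memory stand-in for rows of project_daily_statistics.
Row = Struct.new(:id, :date)

# Hypothetical helper: smallest id whose date is on or after the cutoff.
# Assumes ids grow with the date, so every later row has a larger id.
def batch_min_value_for(rows, cutoff)
  rows.select { |row| row.date >= cutoff }.map(&:id).min
end

# Hypothetical helper: splits the ids at or above batch_min_value into
# inclusive id ranges of at most batch_size ids each.
def batch_ranges(rows, batch_min_value:, batch_size:)
  ids = rows.map(&:id).select { |id| id >= batch_min_value }.sort
  ids.each_slice(batch_size).map { |slice| (slice.first..slice.last) }
end

ROWS = [
  Row.new(1, Date.new(2025, 7, 30)),
  Row.new(2, Date.new(2025, 7, 31)),
  Row.new(3, Date.new(2025, 8, 1)),
  Row.new(4, Date.new(2025, 8, 1)),
  Row.new(5, Date.new(2025, 8, 2)),
  Row.new(6, Date.new(2025, 8, 3)),
  Row.new(7, Date.new(2025, 8, 4))
].freeze

# Rows dated before the cutoff (ids 1 and 2) are never visited.
MIN_ID = batch_min_value_for(ROWS, Date.new(2025, 8, 1))
RANGES = batch_ranges(ROWS, batch_min_value: MIN_ID, batch_size: 2)
```

Starting the batching at a minimum id, rather than filtering each batch by date, keeps the skipped rows out of the iteration entirely.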

Q: Why August 1, 2025?

A: We partition the original table monthly, and GitLab only retains project_daily_statistics data for 30 days. Backfilling from August 1, 2025 therefore starts at a monthly partition boundary, and older rows do not need to be copied.
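One way to reason about such a start date is sketched below. It is an assumption about how the date could be derived, not a description of how it was chosen for this MR: `backfill_start_date` is a hypothetical helper, and the reference date in the usage line is made up for illustration. Only the 30-day retention and the monthly partitioning come from the description above.

```ruby
require 'date'

# Hypothetical helper: first day of the month that contains the oldest
# date still inside the retention window. Aligning to the first of the
# month matches monthly partition boundaries.
def backfill_start_date(reference_date, retention_days: 30)
  oldest_retained = reference_date - retention_days
  Date.new(oldest_retained.year, oldest_retained.month, 1)
end

# Illustrative reference date (an assumption, not taken from the MR).
START_DATE = backfill_start_date(Date.new(2025, 9, 15))
```

With a reference date in mid-September 2025, the oldest retained day falls in August, so the sketch returns the first day of August 2025.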

References

Add partitioned copy for sent_notifications table

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Emma Park
