Fix partition missing error in project_daily_statistics backfill
What does this MR do and why?
Related: PostgreSQL error when upgrading from 18.5.3 -> ... (#582526 - closed) BackfillPartitionedProjectDailyStatistics error... (#582512 - closed)
This MR fixes partition errors when Self-managed/Dedicated customers upgrade to GitLab 18.5
Problem:
- Migration has hardcoded START_DATE = Date.new(2025, 8, 1)
- Model has retain_for: 3.months
- For GitLab.com (upgraded Aug): August within 3 months, works
- For self-managed (upgrading Dec): August 4 months old, partition deleted by partition manager
- Backfill tries to copy August data -> ERROR: no partition found
Root cause: Hardcoded START_DATE assumes controlled rollout timing, breaks for self-managed deployments upgrading months later.
Solution (two-part fix):
-
Dynamic START_DATE: Use 3.months.ago.beginning_of_month instead of hardcoded date to respect retention policy at migration time
-
Filter in background job: Skip records outside retention window during backfill to handle already-enqueued jobs on affected instances
Query Plan
After the swap those table names have been changed. So, for the query plan I used project_daily_statistics_archived
`project_daily_statistics` --> `project_daily_statistics_archived`,
`project_daily_statistics_b8088ecbd2` --> `project_daily_statistics`
Click to expand: query plan
INSERT INTO project_daily_statistics (id, project_id, fetch_count, date)
SELECT id, project_id, fetch_count, date
FROM project_daily_statistics_archived
WHERE project_daily_statistics_archived.id IN (
SELECT project_daily_statistics_archived.id
FROM project_daily_statistics_archived
WHERE project_daily_statistics_archived.id BETWEEN 4343208907 AND 4343210907
LIMIT 1000
)
AND project_daily_statistics_archived.date >= '2025-09-01'
LIMIT 1000
FOR UPDATE
ON CONFLICT (id, date) DO NOTHING
ModifyTable on public.project_daily_statistics (cost=50.98..3659.43 rows=0 width=0) (actual time=11.560..11.564 rows=0 loops=1)
Buffers: shared hit=12291
WAL: records=1000 fpi=0 bytes=54000
I/O Timings: read=0.000 write=0.000
-> Subquery Scan on *SELECT* (cost=50.98..3659.43 rows=201 width=24) (actual time=1.520..7.066 rows=1000 loops=1)
Buffers: shared hit=7176
WAL: records=1000 fpi=0 bytes=54000
I/O Timings: read=0.000 write=0.000
-> Limit (cost=50.98..3657.42 rows=201 width=58) (actual time=1.519..6.883 rows=1000 loops=1)
Buffers: shared hit=7176
WAL: records=1000 fpi=0 bytes=54000
I/O Timings: read=0.000 write=0.000
-> LockRows (cost=50.98..3657.42 rows=201 width=58) (actual time=1.518..6.754 rows=1000 loops=1)
Buffers: shared hit=7176
WAL: records=1000 fpi=0 bytes=54000
I/O Timings: read=0.000 write=0.000
-> Nested Loop (cost=50.98..3655.41 rows=201 width=58) (actual time=1.489..5.472 rows=1000 loops=1)
Buffers: shared hit=5313
I/O Timings: read=0.000 write=0.000
-> HashAggregate (cost=50.41..60.41 rows=1000 width=40) (actual time=1.463..1.762 rows=1000 loops=1)
Group Key: "ANY_subquery".id
Buffers: shared hit=313
I/O Timings: read=0.000 write=0.000
-> Subquery Scan on ANY_subquery (cost=0.57..47.91 rows=1000 width=40) (actual time=0.033..1.162 rows=1000 loops=1)
Buffers: shared hit=313
I/O Timings: read=0.000 write=0.000
-> Limit (cost=0.57..37.91 rows=1000 width=8) (actual time=0.023..0.943 rows=1000 loops=1)
Buffers: shared hit=313
I/O Timings: read=0.000 write=0.000
-> Index Only Scan using project_daily_statistics_archived_pkey on public.project_daily_statistics_archived project_daily_statistics_archived_1 (cost=0.57..71.28 rows=1894 width=8) (actual time=0.023..0.854 rows=1000 loops=1)
Index Cond: ((project_daily_statistics_archived_1.id >= '4343208907'::bigint) AND (project_daily_statistics_archived_1.id <= '4343210907'::bigint))
Heap Fetches: 137
Buffers: shared hit=313
I/O Timings: read=0.000 write=0.000
-> Index Scan using project_daily_statistics_archived_pkey on public.project_daily_statistics_archived (cost=0.57..3.60 rows=1 width=26) (actual time=0.003..0.003 rows=1 loops=1000)
Index Cond: (project_daily_statistics_archived.id = "ANY_subquery".id)
Filter: (project_daily_statistics_archived.date >= '2025-09-01'::date)
Rows Removed by Filter: 0
Buffers: shared hit=5000
I/O Timings: read=0.000 write=0.000
Settings: random_page_cost = '1.5', seq_page_cost = '4', work_mem = '100MB', effective_cache_size = '472585MB', jit = 'off'
Time: 13.934 ms
- planning: 2.021 ms
- execution: 11.913 ms
- I/O read: 0.000 ms
- I/O write: 0.000 ms
filter_sub_batch_content method scans by ID first and then apply date as we expected. 0 records were written because all those records already exist in the project_daily_statistics table.
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.