Use project_uploads partition index for project export
What does this MR do and why?
Introduce Upload.for_project so callers that query Upload directly
get the right partition-pruning predicates (project_id and
model_type = 'Project'), and switch UploadsManager#each_uploader
to use it. Filtering by the project_id sharding key lets the
planner serve ORDER BY id LIMIT N (used by find_each) from
index_project_uploads_on_project_id_and_id index added in !237086 (merged).
Also move the avatar exclusion to Ruby.
Database review
Query plan: https://console.postgres.ai/gitlab/gitlab-production-main/sessions/52396/commands/153844 (Uses a project with +2 millions uploads)
SELECT "uploads".*
FROM "uploads"
WHERE "uploads"."project_id" = 278964
AND "uploads"."model_type" = 'Project'
ORDER BY "uploads"."id" ASC
LIMIT 100 Limit (cost=0.57..94.98 rows=100 width=297) (actual time=0.075..0.179 rows=100 loops=1)
Buffers: shared hit=95
-> Index Scan using index_project_uploads_on_project_id_and_id on public.project_uploads uploads (cost=0.57..2299203.39 rows=2435348 width=297) (actual time=0.073..0.170 rows=100 loops=1)
Index Cond: (uploads.project_id = 278964)
Filter: (uploads.model_type = 'Project'::text)
Rows Removed by Filter: 0
Buffers: shared hit=95
Settings: effective_cache_size = '472585MB', jit = 'off', random_page_cost = '1.5', work_mem = '230MB', seq_page_cost = '4'```Time: 3.051 ms
- planning: 2.811 ms
- execution: 0.240 ms
- I/O read: N/A
- I/O write: N/A
Shared buffers:
- hits: 95 (~760.00 KiB) from the buffer pool
- reads: 0 from the OS file cache, including disk I/O
- dirtied: 0
- writes: 0References
Screenshots or screen recordings
| Before | After |
|---|---|
How to set up and validate locally
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Edited by Rodrigo Tomonari