Partition and reduce size of merge_request_diff_files
`merge_request_diff_files` will be partitioned by the `merge_request_diff_id` value. As part of this work, we will also address items associated with https://gitlab.com/groups/gitlab-org/-/epics/16881+ by implementing https://gitlab.com/gitlab-org/gitlab/-/issues/557745+.

Much of this work can occur in parallel: the ActiveRecord model changes that support `*_path` de-duplication can be built into the application while we set up and then backfill the partitioned table.

## Plan

### Prepare Partition

* [x] [Create new `merge_request_diff_files` partition table](https://gitlab.com/gitlab-org/gitlab/-/issues/561368)
  * Initial MR: https://gitlab.com/gitlab-org/gitlab/-/merge_requests/199297
  * Remove `NOT NULL` constraint from `new_path` in partitioned table
  * In partition trigger, set `new_path = nil if old_path == new_path`
* [x] https://gitlab.com/gitlab-org/gitlab/-/issues/559142 Update existing `merge_request_diff_files` table
  * [x] Remove `NOT NULL` constraint from `new_path`: https://gitlab.com/gitlab-org/gitlab/-/merge_requests/200265
* [x] https://gitlab.com/gitlab-org/gitlab/-/issues/557745 Add support for `*_path` de-duplication in Rails
  * [x] Add custom getter in `MergeRequestDiffFile` to return `old_path` when `new_path` is `NULL` (https://gitlab.com/gitlab-org/gitlab/-/merge_requests/199501)
  * [ ] We should also look at `DiffFile#to_hash` (this is in `app/models/concerns`). This method is called whenever we build the `Gitlab::Git::DiffCollection` based on `MergeRequestDiffFiles`.
  * [x] Update `*_path` values before they get to the DB
    * [x] Add `before_validation` to set `new_path = nil if old_path == new_path`
    * [x] Update `MergeRequestDiff#build_merge_request_diff_files` as we build the rows we bulk insert (see `MergeRequestDiff#save_diffs`)

### Execute Backfill

* [x] https://gitlab.com/gitlab-org/gitlab/-/issues/422767 Create background migration to move data from original table to partition
  * [x] Insert `project_id` from `merge_request_diff.project_id`
  * [x] `new_path = nil if old_path == new_path`

### Backfill Cleanup

* [ ] https://gitlab.com/gitlab-org/gitlab/-/issues/422769+

### Use Partitioned Table

* [ ] https://gitlab.com/gitlab-org/gitlab/-/issues/422771+

### Remove Old Table & Supports

* [no issue as of yet]

## Current status

```glql
title: ⏳ Work in progress/up next (assigned + milestone)
display: table
sort: milestone asc
fields: state, milestone, assignee, title, labels("workflow::*"), lastComment
query: group = "gitlab-org" and epic = &11272 and state = opened and assignee != None and milestone != None
```

```glql
title: 🎉 Done
display: table
sort: closedAt desc
fields: state, closedAt, milestone, assignee, title
query: group = "gitlab-org" and epic = &11272 and state = closed
```
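
For reference, the `*_path` de-duplication steps in the plan (store `new_path` as `NULL` when it equals `old_path`, and have the getter fall back to `old_path`) can be sketched in plain Ruby. This is a minimal illustration, not the code from the linked merge requests: `DiffFileRow`, `stored_new_path`, and `deduplicate_paths!` are hypothetical names, and the real model would rely on ActiveRecord attributes and a `before_validation` callback rather than instance variables.

```ruby
# Hypothetical stand-in for the ActiveRecord model; not GitLab's actual code.
class DiffFileRow
  attr_accessor :old_path
  attr_writer :new_path

  def initialize(old_path:, new_path:)
    @old_path = old_path
    @new_path = new_path
  end

  # Custom getter: fall back to old_path when new_path is stored as NULL (nil).
  def new_path
    @new_path.nil? ? old_path : @new_path
  end

  # The value that would actually be written to the new_path column.
  def stored_new_path
    @new_path
  end

  # Mirrors the normalization step: drop new_path when it duplicates old_path.
  # In Rails this would run from a before_validation callback (assumption).
  def deduplicate_paths!
    @new_path = nil if @new_path == old_path
    self
  end
end

# A file whose path did not change: the duplicate value is not stored.
unchanged = DiffFileRow.new(old_path: 'README.md', new_path: 'README.md').deduplicate_paths!

# A renamed file: both paths are kept.
renamed = DiffFileRow.new(old_path: 'a.rb', new_path: 'b.rb').deduplicate_paths!
```

Callers that read `new_path` see the same value before and after normalization, so only the stored representation changes. That is what allows the model change to ship while the partitioned table is still being set up and backfilled.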