Partially revert "Optimise the external diff storage migration query"
What does this MR do?
This is a conceptual revert of !38579 (merged).
The change in the previous MR moved us from gathering 1,000 diffs to migrate via a single large query (which scales to ~15 million rows) to scanning through the table in multiple queries (which we hoped would scale to the 72 million rows we actually have).
However, while running this in gprd it has become apparent that skipping over rows that are not stored externally and have no merge_request_diff_files rows makes this too slow to be useful, particularly given that we always scan in order.
This revert keeps some cosmetic code reordering changes, but returns us to the previous strategy of performing one big query to get the 1,000 rows. This will prevent us from shipping the change while we work out what the next steps should be.
Does this MR meet the acceptance criteria?
Conformity
- Changelog entry
- Documentation (if required)
- Code review guidelines
- Merge request performance guidelines
- Style guides
- Database guides
- Separation of EE specific content
Availability and Testing
- Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process.
- Tested in all supported browsers
- Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
- Label as security and @ mention @gitlab-com/gl-security/appsec
- The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
- Security reports checked/validated by a reviewer from the AppSec team