
Partially revert "Optimise the external diff storage migration query"

Nick Thomas requested to merge partial-revert-schedule-merge-request-diffs into master

What does this MR do?

This is a conceptual revert of !38579 (merged)

The change in the previous MR moved us from gathering the 1,000 diffs to migrate via a single large query (which scales to ~15 million rows) to scanning through the table in multiple queries (which we hoped would scale to the 72 million rows we actually have).

However, it has become apparent while running this in gprd that skipping over rows that are not stored externally and have no merge_request_diff_files rows makes the scan too slow to be useful, particularly given that we always scan in order.

This revert keeps some cosmetic code reordering changes, but returns us to the previous strategy of performing one big query to get the 1,000 rows. Reverting prevents the scanning change from shipping while we work out what the next steps should be.
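
To make the trade-off concrete, here is a minimal sketch of the two strategies against a toy in-memory SQLite table. It is not the code from either MR: the `diffs` table, the `needs_migration` flag, and all function names are hypothetical simplifications standing in for the real conditions (not stored externally, has merge_request_diff_files rows).

```python
import sqlite3


def build_db(total_rows, migratable_ids):
    """Create a toy table; only ids in migratable_ids still need migrating."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE diffs ("
        "id INTEGER PRIMARY KEY, needs_migration INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO diffs (id, needs_migration) VALUES (?, ?)",
        ((i, 1 if i in migratable_ids else 0) for i in range(1, total_rows + 1)),
    )
    return conn


def single_query(conn, limit):
    """Strategy restored by this revert: one big query finds `limit` rows."""
    rows = conn.execute(
        "SELECT id FROM diffs WHERE needs_migration = 1 ORDER BY id LIMIT ?",
        (limit,),
    ).fetchall()
    return [r[0] for r in rows], 1


def batched_scan(conn, limit, batch_size):
    """Reverted strategy: walk the id range in order, one query per window.

    Returns the ids found and the number of window queries issued
    (the initial MAX(id) lookup is not counted).
    """
    max_id = conn.execute("SELECT COALESCE(MAX(id), 0) FROM diffs").fetchone()[0]
    found, queries, start = [], 0, 0
    while len(found) < limit and start < max_id:
        rows = conn.execute(
            "SELECT id FROM diffs "
            "WHERE id > ? AND id <= ? AND needs_migration = 1 ORDER BY id",
            (start, start + batch_size),
        ).fetchall()
        queries += 1
        found.extend(r[0] for r in rows)
        start += batch_size
    return found[:limit], queries


if __name__ == "__main__":
    # All candidate rows sit at the end of the table, so the ordered scan
    # has to step through every empty window before it finds any work.
    conn = build_db(10_000, set(range(9_001, 10_001)))
    print(single_query(conn, 100)[1])        # number of queries issued
    print(batched_scan(conn, 100, 1_000)[1])  # number of queries issued
```

Both functions return the same ids; the difference is how many round trips the ordered scan spends on windows that contain nothing to migrate. The sketch says nothing about how expensive the single query itself is at production scale, which is the open question for the next steps.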

Screenshots

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
