
Refactor add populate commit permission migration

What does this MR do and why?

See the issue for screenshots/log information and background.

Related to #365249 (closed) and #344459 (closed)

Issue

The migration is supposed to start at visibility_level = 0 and repository_access_level = 0 and move through every permutation of the available options. For each permutation, it updates, in batches, all commits that are missing those fields and belong to projects with those levels. The Elasticsearch and Kibana logs showed that each successful batch updated 200,000 documents, but the migration never moved past the first permutation (visibility_level = 0 and repository_access_level = 0). The migration calculator worksheet showed 270,000 commit documents associated with projects at visibility_level = 0 and repository_access_level = 0. The original query did not restrict the update to commits missing those fields, so it was probably updating nearly the same 200,000 documents each time and never reached the next permutation.
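The stall can be illustrated with a small in-memory simulation. This is only a sketch: the numbers are scaled down from 270,000 documents and a batch size of 200,000, the helper names `run_batch` and `simulate` are hypothetical, and it assumes the unfiltered query returns documents in the same order on every batch, which the logs suggest but do not prove.

```python
def run_batch(docs, batch_size, only_missing):
    """Update up to batch_size documents; return how many still lack the field."""
    # When only_missing is False, every document is a candidate, updated or not.
    candidates = [d for d in docs if not only_missing or "visibility_level" not in d]
    for doc in candidates[:batch_size]:
        doc["visibility_level"] = 0
        doc["repository_access_level"] = 0
    return sum(1 for d in docs if "visibility_level" not in d)


def simulate(total_docs, batch_size, only_missing, max_batches=5):
    """Return the remaining-document count observed after each batch."""
    docs = [{"id": i} for i in range(total_docs)]
    remaining_history = []
    for _ in range(max_batches):
        remaining = run_batch(docs, batch_size, only_missing)
        remaining_history.append(remaining)
        if remaining == 0:
            break  # the permutation is complete, so the migration could advance
    return remaining_history


unfiltered = simulate(total_docs=27, batch_size=20, only_missing=False)
filtered = simulate(total_docs=27, batch_size=20, only_missing=True)
print(unfiltered)  # [7, 7, 7, 7, 7]
print(filtered)    # [7, 0]
```

Without the missing-field restriction, the count of documents still to update never reaches zero, so the first permutation never finishes. With the restriction, the second batch drains the permutation.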

Fix

This MR:

  • updates the logic used when querying for the Elasticsearch documents to update. The new query searches for commit documents that are missing the fields being added (since both project_visibility and repository_access_level are updated at the same time, we only check for one being missing).
  • adds a new migration state parameter, documents_remaining_for_permutation, to help track completion of each permutation via logging
  • moves a few methods into private methods
  • reduces the batch_size to 100,000. 200,000 may have been too high because production handles far more searching and indexing than staging. See the updated calculation spreadsheet (internal link). There are 471,774,692 documents left to update, and the initial updates were taking 10-30 seconds on production (seen in Kibana logs). The migration should complete in 236 hours.
  • refactors existing spec
    • use the same small batch size (10) for each migration
    • run the migration for each batch for each permutation of project_visibility and repository_access_level
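Two of the changes above can be sketched briefly: the shape of a missing-field query and the arithmetic behind the 236-hour estimate. This is a rough illustration, not the migration's actual code. The function names, the `project_id_field` parameter, and its default are hypothetical, and the three-minute interval between batches is an assumption chosen because it reproduces the stated figure; the MR itself does not state it.

```python
import math


def missing_fields_query(project_ids, missing_field="visibility_level",
                         project_id_field="project_id"):
    """Build an Elasticsearch query body for commits that still lack a field.

    project_id_field is a hypothetical name for whatever field links a commit
    document to its project; the real mapping may differ.
    """
    return {
        "query": {
            "bool": {
                # Restrict to commits of projects in the current permutation.
                "filter": [{"terms": {project_id_field: list(project_ids)}}],
                # Skip documents that were already updated by an earlier batch.
                "must_not": [{"exists": {"field": missing_field}}],
            }
        }
    }


def estimate_hours(documents_remaining, batch_size, minutes_per_batch):
    """Estimate total runtime if one batch is processed per interval."""
    batches = math.ceil(documents_remaining / batch_size)
    return batches * minutes_per_batch / 60


query = missing_fields_query([1, 2, 3])
hours = estimate_hours(471_774_692, 100_000, minutes_per_batch=3)
print(round(hours))  # 236
```

Only one field needs an `exists` check because both fields are written in the same update. With 471,774,692 documents and a batch size of 100,000, the migration needs 4,718 batches, which at the assumed interval comes to roughly 236 hours.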

Screenshots or screen recordings

N/A

How to set up and validate locally

This MR is difficult to test locally. I have validated that the specs fail when the changes to the migration are rolled back.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Terri Chu
