POC: GitLab Migration - export and import relations in batches

This is the POC for the "Export/import in batches during GitLab Migration" epic (&9036 - closed).

Tasks

  • Come up with possible implementation paths.

  • Check feasibility of implementation plan written by @georgekoltsov here:

  1. Create a new model / db table BulkImports::ExportBatch to be associated with BulkImports::Export (one to many)
  2. Each ExportBatch to have its own export status & row range / offset / index to indicate which rows are contained in that batch
  3. Add an optional flag to the export_relations API to export in batches, e.g. /api/v4/projects/123/export_relations?batch_export=true
  4. Whenever the flag is provided, export in batches; otherwise fall back to the previous non-batched approach
  5. For batched export, the relation export worker would, for each batch of rows (e.g. 1000 rows per batch), enqueue a new RelationBatchExportWorker that does what the current RelationExportWorker does, but on that batch's set of records.
  6. Enqueuing batch workers can cause race conditions when updating the overall export status, so we need to work out how to make this reliable: the export must not get stuck, and its status must not be updated prematurely
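
Steps 2, 5 and 6 of the plan can be sketched as a small, language-neutral simulation. This is not GitLab code: Python threads stand in for Sidekiq workers, in-memory objects stand in for the BulkImports::Export and BulkImports::ExportBatch rows, and the names offset, limit, finish_count, complete_batch, batch_worker and run_export are hypothetical. The sketch assumes that the batch status update and the "are all batches done?" check happen under one lock, which is one possible way to keep the overall status from being set prematurely or more than once.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field

BATCH_SIZE = 1000  # rows per batch, matching the example in step 5


@dataclass
class ExportBatch:
    """Hypothetical stand-in for a BulkImports::ExportBatch row."""
    index: int
    offset: int            # first row covered by this batch
    limit: int             # number of rows in this batch
    status: str = "pending"


@dataclass
class Export:
    """Hypothetical stand-in for a BulkImports::Export row."""
    total_rows: int
    batches: list = field(default_factory=list)
    status: str = "started"
    finish_count: int = 0  # how many times the export was marked finished
    _lock: threading.Lock = field(default_factory=threading.Lock, repr=False)

    def create_batches(self, batch_size=BATCH_SIZE):
        # One batch per slice of batch_size rows; the last one may be shorter.
        for index, offset in enumerate(range(0, self.total_rows, batch_size)):
            limit = min(batch_size, self.total_rows - offset)
            self.batches.append(ExportBatch(index, offset, limit))

    def complete_batch(self, batch, ok=True):
        # Assumption: in a real system a row lock or an atomic counter would
        # play the role of this in-process lock. The batch update and the
        # overall status check are done together, so only one worker can
        # observe "all batches finished".
        with self._lock:
            batch.status = "finished" if ok else "failed"
            if self.status != "started":
                return
            if any(b.status == "failed" for b in self.batches):
                self.status = "failed"
            elif all(b.status == "finished" for b in self.batches):
                self.status = "finished"
                self.finish_count += 1


def batch_worker(export, batch, rows):
    # Does the same work on every batch, but only on that batch's row range.
    exported = rows[batch.offset:batch.offset + batch.limit]
    export.complete_batch(batch)
    return len(exported)


def run_export(total_rows, workers=8):
    rows = list(range(total_rows))  # fake relation rows
    export = Export(total_rows=total_rows)
    export.create_batches()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        counts = list(pool.map(lambda b: batch_worker(export, b, rows),
                               export.batches))
    return export, sum(counts)


export, exported_rows = run_export(2500)
print(len(export.batches), export.status, exported_rows)  # 3 finished 2500
```

With 2500 rows the sketch creates three batches (1000, 1000 and 500 rows), and the export is marked finished exactly once regardless of the order in which the workers complete. It does not cover retries, batches that never report back, or an export with zero rows, all of which a real implementation would need to handle to avoid a stuck export.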
Edited by Magdalena Frankiewicz