
Ignore object pool already exists creation errors

James Fargher requested to merge fix_restore_object_pool into master

What does this MR do and why?

Object pools are not explicitly backed up. Each repository bundle file contains all the objects needed for that repo, including objects that were part of a pool. On restore, after all the repositories have been restored, every object pool is "scheduled". Scheduling sets off the following sequence:

  1. Runs a worker called ObjectPool::CreateWorker
  2. The worker tries to create the pool
  3. Gitaly explodes because the pool already exists (caused by GRPC::FailedPrecondition: 9:creating object pool: repository exists already).
  4. The worker fails and gets retried
  5. Because the pool is now marked as failed, the worker deletes the pool from Gitaly (see https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/workers/object_pool/create_worker.rb).
  6. The worker tries to create the pool again.

If the repository has had its objects deduplicated (moved from its own repo into the pool repo) at any time before step 5, then those objects are deleted along with the pool. I think this used to work (despite the pool creation failure) because Gitaly did not deduplicate as quickly as it does now, so deduplication only happened some time after step 5.
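
The failure sequence above can be modelled with a small, self-contained sketch. Everything in it is a hypothetical stand-in: `FakeGitaly`, `run_create_worker` and the hash-based pool record are invented for illustration and are not the real Gitaly client or `ObjectPool::CreateWorker` code. It only shows why deleting and recreating a pool that already holds deduplicated objects loses those objects.

```ruby
# Hypothetical in-memory stand-in for Gitaly; not the real client API.
class FakeGitaly
  class AlreadyExists < StandardError; end

  attr_reader :pools

  def initialize
    @pools = {} # pool name => list of object ids stored in the pool
  end

  def create_pool(name)
    raise AlreadyExists, "creating object pool: repository exists already" if @pools.key?(name)

    @pools[name] = []
  end

  def delete_pool(name)
    @pools.delete(name)
  end
end

# Simplified model of the worker behaviour described in steps 1 to 6.
def run_create_worker(gitaly, pool_record)
  # Step 5: a pool marked as failed is deleted before creation is retried.
  gitaly.delete_pool(pool_record[:name]) if pool_record[:state] == :failed
  gitaly.create_pool(pool_record[:name])
  pool_record[:state] = :ready
rescue FakeGitaly::AlreadyExists
  pool_record[:state] = :failed
  raise
end

gitaly = FakeGitaly.new
gitaly.create_pool("pool-1")        # the pool already exists after the restore
gitaly.pools["pool-1"] << "abc123"  # an object has been deduplicated into it

record = { name: "pool-1", state: :none }
first_error = nil
begin
  run_create_worker(gitaly, record) # steps 1 to 4: creation fails
rescue FakeGitaly::AlreadyExists => e
  first_error = e
end

run_create_worker(gitaly, record)   # steps 5 and 6: delete, then recreate
lost_objects = gitaly.pools["pool-1"].empty?
```

After the retry the pool exists again but is empty, which corresponds to the data loss described above.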

For the purposes of restoring, we only care that the pool repo exists. If it already exists, then our job is done. This is because we can rely on housekeeping to maintain the state of the pool repo now that the actual repos have all the objects they need.
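
A minimal sketch of that idea, assuming the "already exists" failure can be told apart from other errors. `PoolAlreadyExists` and `create_pool_ignoring_existing` are hypothetical names used only for illustration; the actual change in this MR may be structured differently, and the real error surfaces as the GRPC::FailedPrecondition shown above.

```ruby
# Hypothetical error class standing in for the "repository exists already" failure.
class PoolAlreadyExists < StandardError; end

# Runs the given creation block and treats "already exists" as success.
# Any other error still propagates, so the worker can fail and be retried.
def create_pool_ignoring_existing
  yield
  :created
rescue PoolAlreadyExists
  :already_exists
end

fresh_result = create_pool_ignoring_existing { :ok }
existing_result = create_pool_ignoring_existing do
  raise PoolAlreadyExists, "creating object pool: repository exists already"
end
```

Because the existing pool is never marked as failed in this sketch, the delete-and-recreate path from step 5 is not reached.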

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

How to set up and validate locally

  1. Create a public project.
  2. Create a public fork of this project.
  3. Find the object pool and "schedule" it the way a backup restore does:
    pool = PoolRepository.last
    pool.state = 'none'
    pool.save
    pool.schedule
  4. Once the Sidekiq jobs have completed, the repos of both projects should still show the list of files in the web UI.
Edited by James Fargher
