Rely on disk_path for PoolRepository identification

Vasilii Iakliushin requested to merge 424194_use_disk_path_for_unique_check into master

What does this MR do and why?

Contributes to https://gitlab.com/gitlab-org/gitlab/-/issues/424194

Problem

A PoolRepository can lose its source_project link when the root project is deleted. That blocks migration of the pool repository from one shard to another.

Solution

  • Remove the presence validation for source_project, because a missing source_project is a valid state in the pool repository flow.
  • Use a combination of disk_path + shard_name to identify a pool repository.
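The identification rule above can be illustrated with a minimal plain-Ruby sketch. It is not the code from this MR: PoolRecord and find_pool are hypothetical names, the records live in an in-memory array rather than the pool_repositories table, and "gitaly-2" is a made-up shard name. The point is only that disk_path + shard_name still picks out a single record when source_project is nil.

```ruby
# Hypothetical in-memory stand-in for rows of the pool_repositories table.
PoolRecord = Struct.new(:disk_path, :shard_name, :source_project_id, keyword_init: true)

# Hypothetical helper: identify a pool by disk_path + shard_name only.
# source_project_id is deliberately not part of the lookup.
def find_pool(pools, disk_path:, shard_name:)
  pools.find { |pool| pool.disk_path == disk_path && pool.shard_name == shard_name }
end

DISK_PATH = "@pools/6b/86/6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b"

pools = [
  # Same pool on two shards; the root project was deleted, so there is no source project.
  PoolRecord.new(disk_path: DISK_PATH, shard_name: "default",  source_project_id: nil),
  PoolRecord.new(disk_path: DISK_PATH, shard_name: "gitaly-2", source_project_id: nil)
]

on_default = find_pool(pools, disk_path: DISK_PATH, shard_name: "default")
on_other   = find_pool(pools, disk_path: DISK_PATH, shard_name: "gitaly-2")
missing    = find_pool(pools, disk_path: DISK_PATH, shard_name: "no-such-shard")
```

Because the same disk_path can exist on several shards while a move is in progress, the shard name is needed alongside it to tell the records apart.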

How to set up and validate locally

  1. Enable the feature flag: Feature.enable(:replicate_object_pool_on_move)
  2. Configure multiple Gitaly storages. This can be done via GDK: https://gitlab.com/gitlab-org/gitlab-development-kit/-/blob/main/doc/howto/gitaly.md#add-gitaly-nodes
  3. Create a public project, then fork it to create a pool repository.
  4. You should see a new record in the pool_repositories table (PoolRepository.last should have a source_project link to the original project).
  5. You should see a new folder for the object pool in your GDK (path to pool folder: repositories/@cluster/pools/).
  6. Send an API request to move the repository to a different storage: https://docs.gitlab.com/ee/api/project_repository_storage_moves.html#schedule-a-repository-storage-move-for-a-project
  7. You should see a new record in the pool_repositories table (the new PoolRepository should still point to the same source_project, but it should have a different shard value).
  8. You should see a new folder for the object pool in your GDK (path to pool folder: repository_storages/praefect-gitaly-1/praefect-internal-1/@cluster/pools/).
  9. Verify that the object pool is replicated as well. The contents of the related object pools (in repositories/@cluster/pools/ and repository_storages/praefect-gitaly-1/praefect-internal-1/@cluster/pools/) should be the same.
  10. Repeat the migration process for the forked project.
  11. The original PoolRepository record should disappear, because all projects were moved to the new shard.
  12. Delete the root project.
  13. The PoolRepository record should have source_project: nil after that.
  14. Send an API request to move the fork's repository to a different storage.
  15. A new PoolRepository record should appear. It should have the same disk_path value as before, source_project: nil, and point to the new shard name.
  16. The old PoolRepository record should be removed, because no projects are linked to it anymore.
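The API calls in steps 6 and 14 could be scripted roughly as follows. This is a sketch under assumptions: the endpoint and the destination_storage_name parameter follow the repository storage moves documentation linked in step 6, while build_storage_move_request is a hypothetical helper and the base URL, token, project ID, and storage name are placeholders to replace with your own values. The request is only built here; sending it is left commented out.

```ruby
require "net/http"
require "uri"
require "json"

# Hypothetical helper: builds (but does not send) the POST request that
# schedules a repository storage move for a project.
def build_storage_move_request(base_url:, token:, project_id:, destination:)
  uri = URI.join(base_url, "/api/v4/projects/#{project_id}/repository_storage_moves")
  request = Net::HTTP::Post.new(uri)
  request["PRIVATE-TOKEN"] = token
  request["Content-Type"] = "application/json"
  request.body = JSON.generate(destination_storage_name: destination)
  [uri, request]
end

# All values below are placeholders, not values taken from this MR.
uri, request = build_storage_move_request(
  base_url: "http://localhost:3000",
  token: "<your_access_token>",
  project_id: 1,
  destination: "<destination_storage>"
)

# To send it against a running instance, uncomment:
# response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
#   http.request(request)
# end
# puts response.code
```

After the move finishes, PoolRepository.last in a Rails console should show the new record described in steps 7 and 15.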

Database

Before

EXPLAIN SELECT "pool_repositories".* FROM "pool_repositories" INNER JOIN "shards" ON "shards"."id" = "pool_repositories"."shard_id" WHERE "pool_repositories"."source_project_id" = 2 AND "shards"."name" = 'default';

Nested Loop  (cost=0.56..6.61 rows=1 width=100) (actual time=5.596..5.597 rows=0 loops=1)  
...

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/22253/commands/71940

After

EXPLAIN SELECT "pool_repositories".* FROM "pool_repositories" INNER JOIN "shards" ON "shards"."id" = "pool_repositories"."shard_id" WHERE "pool_repositories"."disk_path" = '@pools/6b/86/6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b' AND "shards"."name" = 'default';

Nested Loop  (cost=0.56..6.61 rows=1 width=100) (actual time=8.046..8.047 rows=0 loops=1)
...

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/22253/commands/71941

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Vasilii Iakliushin
