Skip to content

Fix deadlock in backup repositories rake task

James Fargher requested to merge fix_concurrent_backup into master

What does this MR do?

Fixes #241701 (closed)

The deadlock is between one thread in Repository#dump_storage trying to load projects and another thread waiting forever on a queue. This MR wraps all potentially blocking calls with a permit_concurrent_loads block. SizedQueue was wrapped as while queue.pop became unacceptably hard to read.

This now runs smoothly in a production like demo environment:

root@jfargher-demo-gitlab:~# gitlab-backup create GITLAB_BACKUP_MAX_CONCURRENCY=2 GITLAB_BACKUP_MAX_STORAGE_CONCURRENCY=1
2020-09-02 03:34:37 +0000 -- Dumping database ... 
Dumping PostgreSQL database gitlabhq_production ... [DONE]
2020-09-02 03:34:39 +0000 -- done
2020-09-02 03:34:39 +0000 -- Dumping repositories ...
 * root/gitaly4 (@hashed/ef/2d/ef2d127de37b942baad06145e54b0c619a1f22327b2ebbcfbec78f5564afe39d) ... 
 * gitlab-instance-administrators-2af70bc3/gitlab-self-monitoring (@hashed/6b/86/6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b) ... 
 * root/gitaly4 (@hashed/ef/2d/ef2d127de37b942baad06145e54b0c619a1f22327b2ebbcfbec78f5564afe39d) ... [DONE]
 * gitlab-instance-administrators-2af70bc3/gitlab-self-monitoring (@hashed/6b/86/6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b) ... [SKIPPED]
 * root/gitaly4 (@hashed/ef/2d/ef2d127de37b942baad06145e54b0c619a1f22327b2ebbcfbec78f5564afe39d) ... [SKIPPED] Wiki
 * gitlab-instance-administrators-2af70bc3/gitlab-self-monitoring (@hashed/6b/86/6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b) ... [SKIPPED] Wiki
 * root/gitaly1 (@hashed/d4/73/d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35) ... 
 * root/gitaly1 (@hashed/d4/73/d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35) ... [DONE]
 * root/gitaly1 (@hashed/d4/73/d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35) ... [SKIPPED] Wiki
 * root/gitaly2 (@hashed/4e/07/4e07408562bedb8b60ce05c1decfe3ad16b72230967de01f640b7e4729b49fce) ... 
 * root/gitaly2 (@hashed/4e/07/4e07408562bedb8b60ce05c1decfe3ad16b72230967de01f640b7e4729b49fce) ... [DONE]
 * root/gitaly2 (@hashed/4e/07/4e07408562bedb8b60ce05c1decfe3ad16b72230967de01f640b7e4729b49fce) ... [SKIPPED] Wiki
 * root/gitaly3 (@hashed/4b/22/4b227777d4dd1fc61c6f884f48641d02b4d121d3fd328cb08b5531fcacdabf8a) ... 
 * root/gitaly3 (@hashed/4b/22/4b227777d4dd1fc61c6f884f48641d02b4d121d3fd328cb08b5531fcacdabf8a) ... [DONE]
 * root/gitaly3 (@hashed/4b/22/4b227777d4dd1fc61c6f884f48641d02b4d121d3fd328cb08b5531fcacdabf8a) ... [SKIPPED] Wiki
2020-09-02 03:34:43 +0000 -- done
2020-09-02 03:34:43 +0000 -- Dumping uploads ... 
2020-09-02 03:34:43 +0000 -- done
2020-09-02 03:34:43 +0000 -- Dumping builds ... 
2020-09-02 03:34:43 +0000 -- done
2020-09-02 03:34:43 +0000 -- Dumping artifacts ... 
2020-09-02 03:34:43 +0000 -- done
2020-09-02 03:34:43 +0000 -- Dumping pages ... 
2020-09-02 03:34:43 +0000 -- done
2020-09-02 03:34:43 +0000 -- Dumping lfs objects ... 
2020-09-02 03:34:43 +0000 -- done
2020-09-02 03:34:43 +0000 -- Dumping container registry images ... 
2020-09-02 03:34:43 +0000 -- [DISABLED]
Creating backup archive: 1599017683_2020_09_02_13.4.0-pre_gitlab_backup.tar ... done
Uploading backup archive to remote storage  ... skipped
Deleting tmp directories ... done
done
done
done
done
done
done
done
Deleting old backups ... skipping
Warning: Your gitlab.rb and gitlab-secrets.json files contain sensitive data 
and are not included in this backup. You will need these files to restore a backup.
Please back them up manually.
Backup task is done.

Screenshots

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
Edited by Saikat Sarkar

Merge request reports