Taking database backup snapshots fails when using PgPool
Summary
GitLab provides a backup rake task that backs up the databases. The task uses PostgreSQL snapshots to 'freeze' the state of the databases so that the backups are consistent.
However, this fails when using PgPool or another PostgreSQL proxy, because the proxy can route each connection to a different database server. For example, assume a database setup with two replicas, giving us databases A, B, and C:
- The snapshot could be set on database A
- The backup could be run on database C
The backup is then called with the snapshot from database A, which does not exist on database C, so the backup fails.
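The failure mode can be illustrated with a small simulation. This is only a sketch under stated assumptions: `Backend`, `RoundRobinProxy`, and `backup_through` are hypothetical names, the snapshot identifiers are made up, and the round-robin routing is a simplification (real proxies apply their own routing rules). It only models the point that an exported snapshot is known to the server that created it and to no other.

```python
import itertools


class Backend:
    """Hypothetical stand-in for one PostgreSQL server."""

    def __init__(self, name):
        self.name = name
        self.snapshots = set()  # snapshot ids exported on this server only

    def export_snapshot(self):
        # Made-up id format; real PostgreSQL ids look different.
        snapshot_id = f"{self.name}-{len(self.snapshots) + 1}"
        self.snapshots.add(snapshot_id)
        return snapshot_id

    def dump_with_snapshot(self, snapshot_id):
        if snapshot_id not in self.snapshots:
            raise LookupError(
                f"snapshot {snapshot_id!r} does not exist on {self.name}"
            )
        return f"dump of {self.name} at {snapshot_id}"


class RoundRobinProxy:
    """Assumed routing: each new connection goes to the next backend."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)

    def connect(self):
        return next(self._cycle)


def backup_through(proxy):
    # Connection 1 exports the snapshot, connection 2 runs the dump.
    snapshot_id = proxy.connect().export_snapshot()
    return proxy.connect().dump_with_snapshot(snapshot_id)


if __name__ == "__main__":
    direct = RoundRobinProxy([Backend("A")])
    print(backup_through(direct))  # both connections reach A

    pooled = RoundRobinProxy([Backend("A"), Backend("B"), Backend("C")])
    try:
        backup_through(pooled)
    except LookupError as error:
        print(f"backup failed: {error}")
```

With a single backend both connections land on the same server and the dump succeeds. With three backends the second connection lands on a server that never exported the snapshot, which mirrors the failure described above.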
Possible solutions
- Detect that we are not using a direct connection and let the backup fail with a descriptive error message
  - Pros: This is in line with our documentation (use a direct database connection and no proxy)
  - Cons: It can break current backup procedures
- Detect that we are not using a direct connection and let the backup continue with a descriptive warning message, ignoring the snapshots
  - Pros: It encourages users to update their backup procedure without breaking it
  - Cons: It goes against our recommendation (use a direct connection)
- Detect a single database and do not take snapshots for it
  - Pros: Simple; most (if not all) customers are on a single database
  - Cons: None
  - There is a draft MR for this one
Proposed solution
Detect single database mode and skip or ignore snapshots.
We can revisit the decomposed database issue when we get there.
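The decision itself is small. The sketch below is not the draft MR and not the rake task's code; `BackupPlan`, `plan_backup`, and the database names `main` and `ci` are hypothetical and only illustrate the rule "one configured database means no snapshots".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupPlan:
    use_snapshots: bool
    reason: str


def plan_backup(database_names):
    """Decide whether the backup should export snapshots.

    database_names: names of the configured databases (illustrative input).
    """
    names = list(database_names)
    if not names:
        raise ValueError("no databases configured")
    if len(names) == 1:
        # Single database mode: skip snapshots, so a proxy cannot break the backup.
        return BackupPlan(False, "single database configured; snapshots skipped")
    # Multiple databases: keep the current snapshot-based behaviour.
    return BackupPlan(True, "multiple databases configured; snapshots kept")


if __name__ == "__main__":
    print(plan_backup(["main"]))
    print(plan_backup(["main", "ci"]))
```

Keeping the multi-database branch unchanged means the proxy problem still exists there, which matches the plan to revisit it later.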
Current impact
Customers with multiple database connections who use PostgreSQL proxy servers (PgPool II, PgBouncer, and others) cannot use our database backup script right now.
Since we do not have such customers yet, the impact is low. However, once self-managed customers start using multiple databases, this will become an issue.