Artifact Registry: Provision PostgreSQL-compatible database across environments
Context
Artifact Registry (AR) is a new modular service to be deployed on Runway GKE, providing organization-level unified artifact management. It will offer a superset of the functionality currently provided by Container Registry, Package Registry, and Virtual Registry.
AR is a monetized GitLab feature that sits on the customer's critical path to production. Container images, packages, and dependencies served by AR are consumed by CI/CD pipelines, deployment workflows, and developer tooling. Downtime or degraded performance directly blocks customers from building, testing, and shipping software. Availability, performance, and reliability of the underlying infrastructure are paramount.
AR requires a dedicated PostgreSQL-compatible database, separate from the GitLab monolith, with its own independent schema lifecycle. The full set of AR infrastructure requirements (covering all components, not just the database) is documented in the infrastructure contract. The database-specific requirements are extracted below.
For more on the overall architecture, see the architecture design document.
Timeline
AR is targeting .com go-live before the end of Q2 FY27 (July 31, 2026). The database needs to be available across environments (staging and production) by June 15, 2026, to allow ~6 weeks for integration testing, staging validation, and progressive rollout.
Requirements
One AR application connects to exactly one PostgreSQL-compatible database. AR uses UUID v7 primary keys generated on the application side.
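As an illustration of what application-side UUID v7 generation involves, here is a minimal sketch following the RFC 9562 bit layout (48-bit Unix millisecond timestamp, version, random bits, variant). The function name `uuid7` and the use of Python are assumptions for illustration only; this is not AR's actual implementation.

```python
import os
import time
import uuid
from typing import Optional


def uuid7(unix_ms: Optional[int] = None) -> uuid.UUID:
    """Build a UUID v7: 48-bit ms timestamp, version 7, 74 random bits, RFC variant."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000

    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    rand_a = rand >> 68                           # top 12 bits
    rand_b = rand & ((1 << 62) - 1)               # low 62 bits

    value = (unix_ms & ((1 << 48) - 1)) << 80     # bits 127..80: timestamp
    value |= 0x7 << 76                            # bits 79..76: version 7
    value |= rand_a << 64                         # bits 75..64: rand_a
    value |= 0b10 << 62                           # bits 63..62: RFC variant
    value |= rand_b                               # bits 61..0: rand_b
    return uuid.UUID(int=value)


if __name__ == "__main__":
    print(uuid7())
```

Because the timestamp occupies the most significant bits, keys generated in different milliseconds sort in creation order, so no database-side sequence or default is needed to produce them.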
The database is heavily partitioned. The proposed schema defines ~45 logical tables for the MVP (Container, Maven, and npm formats), most hash-partitioned with 64 partitions each, resulting in over 1,200 physical PostgreSQL tables. Each additional artifact format will add more tables.
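To make the partition count concrete, the sketch below generates PostgreSQL DDL for one hash-partitioned table with 64 partitions. The table name `example_artifacts`, the `_pN` partition naming, and the single-column layout are hypothetical; the real schema is defined in ADR-007.

```python
from typing import List

PARTITION_COUNT = 64  # from the text: 64 hash partitions per partitioned table


def hash_partition_ddl(
    table: str, key: str = "id", partitions: int = PARTITION_COUNT
) -> List[str]:
    """Return DDL for one parent table plus one statement per hash partition."""
    statements = [
        f"CREATE TABLE {table} ({key} uuid NOT NULL, PRIMARY KEY ({key})) "
        f"PARTITION BY HASH ({key});"
    ]
    for remainder in range(partitions):
        statements.append(
            f"CREATE TABLE {table}_p{remainder} PARTITION OF {table} "
            f"FOR VALUES WITH (MODULUS {partitions}, REMAINDER {remainder});"
        )
    return statements


if __name__ == "__main__":
    ddl = hash_partition_ddl("example_artifacts")  # hypothetical table name
    print(len(ddl))  # 1 parent + 64 partitions
    print(ddl[1])
```

Each partitioned logical table therefore contributes 64 physical tables in addition to its parent, which is how a schema of roughly 45 logical tables reaches four-digit physical table counts.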
Isolation
| Requirement | Level | Detail |
|---|---|---|
| Separate logical database | MUST | Own schema, migrations, and credentials. Other modules MUST NOT have cross-database access. |
| Separate physical instance | SHOULD | See rationale below. |
AR MUST NOT be required to shard across multiple databases at the application level. If the infrastructure layer provides multiple physical backends behind a single logical endpoint (e.g., proxy, connection pooler), that is transparent to AR.
AR MUST NOT be required to provide any replication, extraction, or migration logic to move its data from a shared server to a dedicated one. The module connects to an endpoint; how data is moved behind that endpoint is an infrastructure concern.
Why a separate physical instance is recommended
| Risk | Detail |
|---|---|
| Resource contention | AR's background operations (GC, lifecycle enforcement, storage accounting self-healing) involve long-running transactions and table scans that compete for I/O, CPU, and memory. |
| Noisy neighbor | A long-running AR migration or GC sweep can degrade co-located modules, and vice versa. |
| Scaling independence | AR's data grows with artifact metadata volume, which scales differently from other modules. Shared backends make independent scaling harder. |
| Migration risk | Starting shared and extracting later requires data migration under load, which is significantly more expensive and riskier than provisioning separately from the start. |
Connection pooling
A connection pooler MAY be placed in front of the database. AR is designed to be compatible with transaction-mode connection poolers (e.g., PgBouncer), following the same pattern as the GitLab Container Registry on .com. AR uses transaction-level advisory locks (pg_advisory_xact_lock) and disables prepared statements by default in favor of simple query protocol.
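The sketch below illustrates why transaction-level advisory locks fit transaction-mode pooling: the lock is taken and released inside a single transaction, so it never depends on the client holding the same server connection afterwards. The helper names, the lock name `gc:repository:42`, and the hashing scheme used to derive the key are assumptions for illustration, not AR's actual code.

```python
import hashlib
from typing import List


def advisory_lock_key(name: str) -> int:
    """Map a lock name to a signed 64-bit integer, the bigint argument
    accepted by pg_advisory_xact_lock."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


def locked_transaction_sql(name: str, body_sql: str) -> List[str]:
    """Return the statements for one transaction guarded by an advisory lock."""
    key = advisory_lock_key(name)
    return [
        "BEGIN;",
        f"SELECT pg_advisory_xact_lock({key});",  # blocks until the lock is free
        body_sql,
        "COMMIT;",  # the lock is released automatically when the transaction ends
    ]


if __name__ == "__main__":
    # Hypothetical lock name and statement, for illustration only.
    for stmt in locked_transaction_sql("gc:repository:42", "SELECT 1;"):
        print(stmt)
```

A session-level lock (`pg_advisory_lock`) would instead stay attached to the server connection, which a transaction-mode pooler may hand to a different client once the transaction finishes.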
Read replicas
Read replicas are not required for the MVP, but AR will need read-only replicas for database load balancing post-MVP. Provisioning SHOULD account for this to avoid a later migration.
Backup and disaster recovery
Database backups are an infrastructure concern. AR MUST NOT be required to implement its own backup or recovery mechanisms.
Observability
AR emits metrics, logs, and traces via LabKit. AR MUST have per-module dashboards, alerting, and SLIs/SLOs tracked independently from co-located modules.
Compatibility
| Requirement | Detail |
|---|---|
| PostgreSQL-compatible | Cloud SQL, AWS RDS, or a custom PostgreSQL cluster are all viable options. |
| Connection pooling | AR is compatible with transaction-mode connection poolers. A pooler is not required but MAY be used. |
| Logical database isolation | Own schema, migrations, credentials. |
| Read-only replicas | Supported (post-MVP, for load balancing). |
| DbLab support | Required. |
| Runway-on-GKE connectivity | The AR application runs on Runway-on-GKE on .com. The database must be reachable from there with low latency. |
Capacity planning
Current .com baseline (Container Registry only, the only component with its own database today):
| Metric | Steady (7d avg) | Peak (7d max) |
|---|---|---|
| Database ops | ~3,800 ops/s | ~6,400 ops/s |
| Database size | ~2.8 TB | n/a |
Package Registry and Dependency Proxy currently use the monolith database, so their DB ops cannot be measured in isolation. Once AR absorbs all components, its DB ops will be higher than the Container Registry numbers shown above.
AR starts from zero: no data and no users. It will grow into these numbers over time as it absorbs all artifact-related workloads.
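For sizing discussions, the Container Registry baseline can be turned into a peak-to-steady ratio and a daily operation count. This is rough arithmetic on the approximate figures from the table above, not a forecast for AR; the function names are illustrative.

```python
STEADY_OPS_PER_S = 3_800  # ~7d average from the baseline table
PEAK_OPS_PER_S = 6_400    # ~7d max from the baseline table
SECONDS_PER_DAY = 86_400


def peak_to_steady_ratio(steady: float, peak: float) -> float:
    """How much headroom above the steady rate the peak requires."""
    return peak / steady


def ops_per_day(ops_per_second: int) -> int:
    """Daily operation count if the given rate were sustained for 24 hours."""
    return ops_per_second * SECONDS_PER_DAY


if __name__ == "__main__":
    ratio = peak_to_steady_ratio(STEADY_OPS_PER_S, PEAK_OPS_PER_S)
    print(f"peak/steady ratio: {ratio:.2f}")                   # 1.68
    print(f"ops/day at steady: {ops_per_day(STEADY_OPS_PER_S):,}")  # 328,320,000
```

On these numbers the peak is about 1.7 times the steady rate, and the steady rate corresponds to roughly 328 million operations per day. Both are lower bounds for the eventual AR workload, since Package Registry and Dependency Proxy traffic is not included.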
Specific VM/node sizing, replica counts, and leader configuration are to be determined by the owning team.
Open questions
Two open questions remain. Each has a dedicated discussion thread:
- PG 18 and UUID v7 compatibility: whether PG 18 is required or recommended, and the implications for UUID v7.
- Dedicated vs. shared cluster: whether AR should get its own physical cluster or share an existing one.
Prior discussion
This issue consolidates the database-related discussion from:
- Infrastructure contract doc comments (Mar 30 - Apr 9)
- Database Excellence weekly thread (Apr 8 - Apr 9)
References
- AR architecture design document
- AR database schema (ADR-007)
- AR infrastructure contract
- Artifact Registry epic
- Parent epic: Provision required infrastructure components
- Runway x Artifact Registry sync
- IMR blueprint
Related to gitlab-org/gitlab#591832 (closed)