LabKit: Artifact Registry Follow-up Work
## Participants
- @d.barrett
## LabKit v2: Shared Go Foundation for GitLab Services
LabKit v2 is a ground-up redesign of LabKit as a shared library for Go satellite services at GitLab. Built around a central `app.App` lifecycle object, it provides structured logging, distributed tracing, HTTP server/client, PostgreSQL, feature flags, and secret management.
[**LabKit v2 MR !314**](https://gitlab.com/gitlab-org/labkit/-/merge_requests/314) | [**go-service-template**](https://gitlab.com/gitlab-org/quality/go-service-template)
The core packages were validated through an [Artifact Registry PoC assessment](https://gitlab.com/gitlab-org/gitlab/-/issues/590332#note_3128897716) in which the AR team replicated the full local stack (LabKit v2, GOFF relay proxy, OTel Collector, Tempo, Loki, Grafana). The feedback below drives the phased delivery plan.
---
### Phase 1: Core Library _(Complete)_
Foundation packages validated by the Artifact Registry PoC. Ready for early adopters.
<table>
<tr>
<th>Package</th>
<th>Description</th>
<th>Tracking issue</th>
<th>Status</th>
</tr>
<tr>
<td>
`app`
</td>
<td>
Application lifecycle: ordered startup and reverse-ordered shutdown via the `Component` interface
</td>
<td>
[team-tasks#4286](https://gitlab.com/gitlab-org/quality/quality-engineering/team-tasks/-/issues/4286)
</td>
<td>
:white_check_mark: Done
</td>
</tr>
<tr>
<td>
`httpserver`
</td>
<td>
HTTP server with auto `/-/liveness` and `/-/readiness` probes, tracing and logging middleware
</td>
<td>
[team-tasks#4283](https://gitlab.com/gitlab-org/quality/quality-engineering/team-tasks/-/issues/4283)
</td>
<td>
:white_check_mark: Done
</td>
</tr>
<tr>
<td>
`httpclient`
</td>
<td>
Traced HTTP client that injects W3C `traceparent` headers into outgoing requests
</td>
<td>
[team-tasks#4282](https://gitlab.com/gitlab-org/quality/quality-engineering/team-tasks/-/issues/4282)
</td>
<td>
:white_check_mark: Done
</td>
</tr>
<tr>
<td>
`postgres`
</td>
<td>Traced PostgreSQL connection pool (pgxpool) with lifecycle management</td>
<td>
[team-tasks#4285](https://gitlab.com/gitlab-org/quality/quality-engineering/team-tasks/-/issues/4285)
</td>
<td>
:white_check_mark: Done
</td>
</tr>
<tr>
<td>
`secret`
</td>
<td>Secret retrieval from environment variables or k8s-style file mounts</td>
<td>
[team-tasks#4284](https://gitlab.com/gitlab-org/quality/quality-engineering/team-tasks/-/issues/4284)
</td>
<td>
:white_check_mark: Done
</td>
</tr>
<tr>
<td>
`trace`
</td>
<td>
Already in place - awaiting production validation
* https://gitlab.com/gitlab-org/labkit/-/work_items/98 - additional feedback from KAS team
</td>
<td>
[Tracking Epic](https://gitlab.com/groups/gitlab-org/quality/-/epics/359)
</td>
<td>
:white_check_mark: Done
</td>
</tr>
<tr>
<td>
`log`
</td>
<td>Already in place - ready for production</td>
<td>Done</td>
<td>
:white_check_mark: Done
</td>
</tr>
<tr>
<td>
`metric`
</td>
<td>
Isolated Prometheus registry with GitLab-standard naming, labels, and SLO-aligned bucket sets. Reviewed by Observability team (Bob Van Landuyt).
* Shipped in [labkit v1.50.0](https://gitlab.com/gitlab-org/labkit/-/releases/v1.50.0) via !362
</td>
<td>
https://gitlab.com/gitlab-org/quality/quality-engineering/team-tasks/-/work_items/4318
</td>
<td>
:white_check_mark: Done
</td>
</tr>
</table>
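
The `app` row describes ordered startup and reverse-ordered shutdown through a `Component` interface. A minimal sketch of that pattern follows. The method names (`Name`, `Start`, `Shutdown`, `Register`) and the `recorder` type are assumptions made for illustration and may not match the actual LabKit v2 API.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Component is a hypothetical lifecycle interface; the real LabKit v2
// Component interface may use different method names and signatures.
type Component interface {
	Name() string
	Start(ctx context.Context) error
	Shutdown(ctx context.Context) error
}

// App starts components in registration order and stops them in reverse.
type App struct {
	components []Component
	started    []Component
}

// Register appends a component; registration order is startup order.
func (a *App) Register(c Component) { a.components = append(a.components, c) }

// Start halts on the first failure and shuts down whatever already started.
func (a *App) Start(ctx context.Context) error {
	for _, c := range a.components {
		if err := c.Start(ctx); err != nil {
			return errors.Join(fmt.Errorf("start %s: %w", c.Name(), err), a.Shutdown(ctx))
		}
		a.started = append(a.started, c)
	}
	return nil
}

// Shutdown stops started components in reverse order and collects errors.
func (a *App) Shutdown(ctx context.Context) error {
	var errs []error
	for i := len(a.started) - 1; i >= 0; i-- {
		if err := a.started[i].Shutdown(ctx); err != nil {
			errs = append(errs, err)
		}
	}
	a.started = nil
	return errors.Join(errs...)
}

// recorder is a toy component that records its lifecycle events.
type recorder struct {
	name   string
	events *[]string
}

func (r recorder) Name() string { return r.name }

func (r recorder) Start(context.Context) error {
	*r.events = append(*r.events, "start:"+r.name)
	return nil
}

func (r recorder) Shutdown(context.Context) error {
	*r.events = append(*r.events, "stop:"+r.name)
	return nil
}

// runDemo returns the lifecycle events of a two-component app.
func runDemo() []string {
	var events []string
	app := &App{}
	app.Register(recorder{"postgres", &events})
	app.Register(recorder{"httpserver", &events})
	ctx := context.Background()
	if err := app.Start(ctx); err != nil {
		events = append(events, "error:"+err.Error())
	}
	_ = app.Shutdown(ctx)
	return events
}

func main() {
	for _, e := range runDemo() {
		fmt.Println(e)
	}
}
```

In this sketch the database pool starts before the HTTP server and stops after it, so in-flight requests can still reach the database while the server drains.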
---
### Phase 2: Critical Fixes — Unblocks Feature Development _(High Priority)_
Identified as "needed now" in the Artifact Registry PoC feedback. These items affect tooling choices and production compatibility for all satellite services.
<table>
<tr>
<th>Item</th>
<th>Why it matters</th>
<th>Tracking issue</th>
<th>Status</th>
</tr>
<tr>
<td>
`database/sql` interface
</td>
<td>
Exposing `pgxpool` directly blocks most Go ORMs, migration tools (goose, Atlas, pgroll), and reuse of the Container Registry's DB load-balancing code. Switch to `database/sql` via `pgx/v5/stdlib` adapter.
</td>
<td>
https://gitlab.com/gitlab-org/quality/quality-engineering/team-tasks/-/issues/4290
</td>
<td>
:white_check_mark: Done
</td>
</tr>
<tr>
<td>PgBouncer compatibility</td>
<td>
GitLab.com routes DB connections through PgBouncer in transaction pooling mode, breaking prepared statements. Expose `QueryExecModeSimpleProtocol` or document DSN workaround.
</td>
<td>
https://gitlab.com/gitlab-org/quality/quality-engineering/team-tasks/-/work_items/4291
</td>
<td>
:white_check_mark: Done
</td>
</tr>
<tr>
<td>Database migrations framework</td>
<td>
The Artifact Registry needs a migration framework from day one. Evaluate goose/Atlas/pgroll/pgschema. Determine whether an abstraction belongs in LabKit v2.
This was tackled directly by the AR team in this issue: https://gitlab.com/gitlab-org/gitlab/-/work_items/592409#note_3148948077
</td>
<td>
https://gitlab.com/gitlab-org/quality/quality-engineering/team-tasks/-/work_items/4292
</td>
<td>
:white_check_mark: Done
</td>
</tr>
</table>
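
The DSN workaround mentioned in the PgBouncer row can be sketched with the standard library alone. pgx reads a `default_query_exec_mode` connection parameter, and setting it to `simple_protocol` avoids server-side prepared statements. The helper name `withSimpleProtocol` and the example host are hypothetical, and the sketch assumes a URL-style DSN rather than the keyword/value form.

```go
package main

import (
	"fmt"
	"net/url"
)

// withSimpleProtocol returns the URL-style DSN with pgx's
// default_query_exec_mode parameter set to simple_protocol.
// The helper itself is hypothetical and not part of LabKit.
func withSimpleProtocol(dsn string) (string, error) {
	u, err := url.Parse(dsn)
	if err != nil {
		return "", fmt.Errorf("parse dsn: %w", err)
	}
	if u.Scheme != "postgres" && u.Scheme != "postgresql" {
		return "", fmt.Errorf("unsupported dsn scheme %q", u.Scheme)
	}
	q := u.Query()
	q.Set("default_query_exec_mode", "simple_protocol")
	u.RawQuery = q.Encode() // Encode sorts parameters by key
	return u.String(), nil
}

func main() {
	// Hypothetical DSN pointing at a PgBouncer listener.
	dsn, err := withSimpleProtocol("postgres://app@pgbouncer.internal:6432/registry?sslmode=require")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(dsn)
}
```

The resulting string can be passed to `sql.Open("pgx", dsn)` once the `pgx/v5/stdlib` driver is imported; that step is left out here to keep the example free of third-party dependencies.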
---
### Phase 3: Production Readiness — Needed Before Deployment _(Medium Priority)_
Required before satellite services can go to production.
<table>
<tr>
<th>Item</th>
<th>Why it matters</th>
<th>Tracking issue</th>
<th>Status</th>
</tr>
<tr>
<td>Graceful degradation on startup</td>
<td>
`app.Start` currently halts on any component failure. Services should be able to start in a degraded state and self-heal for non-critical dependencies (e.g. k8s sidecars that aren't ready yet).
</td>
<td>
https://gitlab.com/gitlab-org/quality/quality-engineering/team-tasks/-/issues/4293
</td>
<td>
:new: New
</td>
</tr>
<tr>
<td>Router choice</td>
<td>
gorilla/mux is unmaintained. `httpserver` exposes the concrete `*mux.Router` type, making a future swap a breaking change for all consumers. Replace with chi or the stdlib `net/http` router (Go 1.22+); expose `Router()` as an interface.
* https://gitlab.com/gitlab-org/quality/quality-engineering/team-tasks/-/work_items/4311 - consensus established here
* [docs: add ADR 001 for LabKit HTTP routing approach](https://gitlab.com/gitlab-com/content-sites/handbook/-/merge_requests/18825) - follow-up ADR to establish this pattern
</td>
<td>
https://gitlab.com/gitlab-org/quality/quality-engineering/team-tasks/-/issues/4294
</td>
<td>
:white_check_mark: Done
</td>
</tr>
<tr>
<td>Design and implement feature flag client</td>
<td>OpenFeature client backed by GO Feature Flag relay proxy; each evaluation emits an OTel span</td>
<td>
[team-tasks#4281](https://gitlab.com/gitlab-org/quality/quality-engineering/team-tasks/-/issues/4281)
</td>
<td>
:new: New
</td>
</tr>
<tr>
<td>Feature flags: operational tooling</td>
<td>The evaluation pipeline works. Before production rollout, define the flag source (ConfigMap vs Git/HTTP/S3 backend), management interface for toggling flags, and operational runbooks.</td>
<td>
[team-tasks#4288](https://gitlab.com/gitlab-org/quality/quality-engineering/team-tasks/-/issues/4288)
</td>
<td>
:new: New
</td>
</tr>
<tr>
<td>Production observability infrastructure</td>
<td>Confirm OTel Collector and trace backend (Tempo/Jaeger) provisioning for Runway-deployed satellite services. Validate log pipeline into Elasticsearch/Kibana.</td>
<td>
https://gitlab.com/gitlab-org/quality/quality-engineering/team-tasks/-/issues/4295
</td>
<td>
:new: New
</td>
</tr>
</table>
---
### Phase 4: Extended Ecosystem — Needs a Timeline
Not blocking day one, but each item needs a decision and roadmap visibility.
| Item | Needed by | Description | Tracking issue | Status |
|------|-----------|-------------|----------------|--------|
| Secret rotation | Before staging deployment | `secret` package does a one-shot read at startup. Add live rotation support for Vault/OpenBao or k8s-managed credentials. | https://gitlab.com/gitlab-org/quality/quality-engineering/team-tasks/-/work_items/4296 | :new: New |
| Cross-service tracing | Nice to have | Go LabKit v2 uses W3C `traceparent`; Ruby LabKit v1 uses Jaeger `uber-trace-id`. End-to-end traces between Go and Rails won't link without header bridging (OTel Collector can likely handle translation). | https://gitlab.com/gitlab-org/quality/quality-engineering/team-tasks/-/issues/4297 | :new: New |
| Shared Go + Ruby feature flags | Before shared feature rollouts | Depends on a Ruby OpenFeature SDK with OFREP provider that doesn't exist yet. | https://gitlab.com/gitlab-org/quality/quality-engineering/team-tasks/-/issues/4298 | :new: New |
| Background jobs framework | TBD | Evaluate asynq / River for LabKit v2 abstraction. | https://gitlab.com/gitlab-org/quality/quality-engineering/team-tasks/-/issues/4299 | :new: New |
| Event bus | TBD | Direct API calls for MVP; NATS for future. Determine whether LabKit v2 should abstract queue topology and publish/subscribe. | https://gitlab.com/gitlab-org/quality/quality-engineering/team-tasks/-/issues/4300 | :new: New |
| Database load balancing | After Phase 2 | Extract Container Registry's read/write splitting and replica routing code. Requires `database/sql` interface first (Phase 2). | https://gitlab.com/gitlab-org/quality/quality-engineering/team-tasks/-/issues/4301 | :new: New |
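
The header bridging mentioned in the cross-service tracing row amounts to a format translation. The sketch below converts a Jaeger `uber-trace-id` value (`trace-id:span-id:parent-span-id:flags`) into a W3C `traceparent` value (`00-<trace-id>-<parent-id>-<flags>`). The function name is hypothetical, URL-encoded header values are not handled, and whether this translation would live in the OTel Collector or in application code remains an open question in the tracking issue.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// isHex reports whether s is 1..maxLen lowercase hex digits.
func isHex(s string, maxLen int) bool {
	if len(s) == 0 || len(s) > maxLen {
		return false
	}
	for _, c := range s {
		if !(c >= '0' && c <= '9' || c >= 'a' && c <= 'f') {
			return false
		}
	}
	return true
}

// jaegerToTraceparent is a hypothetical helper that maps a Jaeger
// uber-trace-id header value onto a W3C traceparent header value.
func jaegerToTraceparent(uber string) (string, error) {
	parts := strings.Split(uber, ":")
	if len(parts) != 4 {
		return "", fmt.Errorf("uber-trace-id: want 4 fields, got %d", len(parts))
	}
	traceID, spanID := strings.ToLower(parts[0]), strings.ToLower(parts[1])
	if !isHex(traceID, 32) || !isHex(spanID, 16) {
		return "", fmt.Errorf("uber-trace-id: invalid trace or span id")
	}
	flags, err := strconv.ParseUint(parts[3], 16, 8)
	if err != nil {
		return "", fmt.Errorf("uber-trace-id: invalid flags: %w", err)
	}
	// Jaeger may omit leading zeros; W3C requires fixed-width ids.
	traceID = strings.Repeat("0", 32-len(traceID)) + traceID
	spanID = strings.Repeat("0", 16-len(spanID)) + spanID
	// W3C treats all-zero ids as invalid.
	if strings.Trim(traceID, "0") == "" || strings.Trim(spanID, "0") == "" {
		return "", fmt.Errorf("uber-trace-id: all-zero trace or span id")
	}
	// Only the sampled bit (0x01) carries over; Jaeger's debug bit is dropped.
	return fmt.Sprintf("00-%s-%s-%02x", traceID, spanID, flags&0x01), nil
}

func main() {
	tp, err := jaegerToTraceparent("4bf92f3577b34da6a3ce929d0e0e4736:00f067aa0ba902b7:0:1")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(tp)
}
```

The Jaeger span ID becomes the `traceparent` parent ID because it identifies the calling span from the receiver's point of view.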
---
### Adoption Tracking
| Service | Work | Issue | Status |
|---------|------|-------|--------|
| gitlab-shell | Migrate logging to labkit v2 log helper methods | [team-tasks#4287](https://gitlab.com/gitlab-org/quality/quality-engineering/team-tasks/-/issues/4287) | :eyes: In review |
| UAM/UAR | Investigate minimal feature flag solution | [team-tasks#4288](https://gitlab.com/gitlab-org/quality/quality-engineering/team-tasks/-/issues/4288) | :new: New |
---
### References
* [LabKit v2 MR !314](https://gitlab.com/gitlab-org/labkit/-/merge_requests/314)
* [go-service-template](https://gitlab.com/gitlab-org/quality/go-service-template)
* [Artifact Registry PoC Assessment](https://gitlab.com/gitlab-org/gitlab/-/issues/590332#note_3128897716) (@jdrpereira)
* [Artifact Registry tooling requirements spreadsheet](https://docs.google.com/spreadsheets/d/1RTdOEht4UzoHVdTCf_bD6ihIrea1_LxKoHwZVeZwdio/)
* [UAR/UAM Integration Notes](https://gitlab.com/groups/gitlab-org/quality/-/work_items/341#note_3089777608)
## Status
<!-- STATUS NOTE START -->
## Status 2026-03-12
:clock1: **total hours spent this week by all contributors**: 40
:tada: **achievements**:
- Phase 1 is effectively complete :tada: — 6 of 7 core packages shipped (`app`, `httpclient`, `secret`, `httpserver`, `postgres`, `log`), with only `v2/metrics` remaining. This gives any new modular feature a stable, production-ready foundation to build on from day one
- Phase 2 is also considered complete :tada: - the AR team is owning any decisions around database migration design. This could eventually lead to additional tooling being baked into LabKit, but this approach leaves us flexible and doesn't put us on the critical path.
- Significant polish invested across the v2 module this week: READMEs, pkg.go.dev `Example_` functions across 7 packages, postgres pgxpool exposure, deprecation cleanup, and @andrewn's protobuf-first config package — making LabKit v2 a genuinely great adoption experience rather than just functional scaffolding
- New GitLab services can now onboard to LabKit v2 with clear patterns, working examples, and best internal practices baked in — dramatically lowering the ramp-up time for teams building new modular features
:issue-blocked: **blockers**:
- Data replication ownership is being resolved — Alessio is opening conversations with the Data Engineering team to have them take ownership of this work
- Feature flag component path is unresolved — conversations kick off next week to decide on the best approach forward
:arrow_forward: **next**:
- Continue iteration on open `v2/metrics` MR to close out Phase 1 in full
- Begin Phase 3! :partyparrot:
_Copied from https://gitlab.com/groups/gitlab-org/quality/-/epics/360#note_3152605397_
<!-- STATUS NOTE END -->