Table bloat on the GitLab.com patroni-registry service
As discussed in the Engineering Allocation call: https://docs.google.com/document/d/1j_9P8QlvaFO-XFoZTKZQsLUpm1wA2Vyf_Y83-9lX9tg/edit#bookmark=id.vvrawsf6dgv4
We're seeing table bloat increasing fairly steadily on the patroni-registry Postgres instance.
Real-time monitoring:
- Btree bloat: https://dashboards.gitlab.net/d/alerts-sat_pg_btree_bloat/alerts-pg_btree_bloat-saturation-detail?orgId=1&var-PROMETHEUS_DS=Global&var-environment=gprd&var-type=patroni-registry&var-stage=main
- Table bloat: https://dashboards.gitlab.net/d/alerts-sat_pg_table_bloat/alerts-pg_table_bloat-saturation-detail?from=now-6h%2Fm&to=now%2Fm&var-PROMETHEUS_DS=Global&var-environment=gprd&var-type=patroni-registry&var-stage=main&var-component=pg_table_bloat&orgId=1
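For readers unfamiliar with how a bloat percentage is read, here is a minimal sketch. It assumes bloat is expressed as the share of a relation's on-disk size that is not accounted for by live data; the exact formula behind the dashboards may differ. The function name `bloat_percent` and the byte sizes are hypothetical and chosen for illustration only; they are not measurements from patroni-registry.

```python
def bloat_percent(actual_bytes: int, expected_bytes: int) -> float:
    """Return the share of on-disk size not accounted for by live data, in percent.

    Assumption: bloat = (actual size - expected size) / actual size.
    This is an illustrative definition, not necessarily the one used by the dashboards.
    """
    if actual_bytes <= 0:
        # An empty relation has no bloat to report.
        return 0.0
    # Clamp at zero: estimates of the expected size can exceed the actual size.
    wasted = max(actual_bytes - expected_bytes, 0)
    return 100.0 * wasted / actual_bytes


# Hypothetical sizes, for illustration only.
table_bloat = bloat_percent(actual_bytes=1_000_000, expected_bytes=940_000)
index_bloat = bloat_percent(actual_bytes=1_000_000, expected_bytes=620_000)
print(f"table bloat: {table_bloat:.0f}%, index bloat: {index_bloat:.0f}%")
```

Under this definition, a steadily rising percentage means dead space is accumulating faster than vacuum can make it reusable.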
We are currently migrating to this service, so this bloat may be a known issue, but I'm opening this issue to track its status.
@sgoldstein @michelletorres @trizzi
Useful resources/links
- Queries used by the API: https://gitlab.com/gitlab-org/container-registry/-/blob/master/docs-gitlab/db/http-api-queries.md
- Online GC spec (including the definition of triggers): https://gitlab.com/gitlab-org/container-registry/-/blob/master/docs-gitlab/db/online-garbage-collection.md
- ER model: https://gitlab.com/gitlab-org/container-registry/-/blob/master/docs-gitlab/db/er_model.png
- Service dashboard: https://dashboards.gitlab.net/d/registry-main/registry-overview?orgId=1. There is a database Service Level Indicator Detail row where we can see the rate and latency of all queries, both from GC and the API (each query has a unique name).
- Online GC dashboard: https://dashboards.gitlab.net/d/registry-gc/registry-garbage-collection-detail?orgId=1
Status
2021-12-30
Table and index bloat have remained stable since the last update, now at 6% and 38%, respectively (source).