TODO shortlist before gstg upgrade
-
drop schema postgres_exporter – it contains only wrapper functions/views that will block the upgrade: drop schema postgres_exporter cascade;
https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/19355 @bshah11 -
3 MRs are not yet merged https://gitlab.com/gitlab-com/gl-infra/db-migration/-/merge_requests?scope=all&state=opened&author_username=vitabaks -
chef to match ansible's changes https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/19116 -
pg_hba needs adjustment https://gitlab.slack.com/archives/C04PDKLJGUW/p1681749383481959 :party-exclaim: -
archive/delayed replicas, and gitlab-restore (all under omnibus control) – not ready for PG14. Workaround? https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/16436 @NikolayS -
TODO items in the CR template – wip -
runbooks with polished texts – wip -
clean tests – planned for Monday (+Tuesday if needed) -
Switchover plan: are we OK with its mechanics? Won't small lags be a problem (e.g., when we switched over RO traffic, due to lags, the app's load balancers redirected all traffic to the old primary)? Good news: PG14 has an optimization for long tx. What are our app load balancer's thresholds for standby lag? Slack discussion to find out: https://gitlab.slack.com/archives/C3NBYFJ6N/p1681918885397679 -
inventory file for gstg is not finalized yet – MR !380 (merged) :party-exclaim: -
ssh -A is not reliable – CR template needs a fix (use local keys) -
misc CR template improvements -
gstg timing (T-3d, T-2d – should be -12h, etc?) -
T minus 1 day (2023-04-23 17:00 UTC)
- sublists have indentation issues with formatting (items are 1, 1, 1, not 1, 2, 3) -
we decided to remove autovacuum nice-to-have step (might decide to return it before gprd) -
tmux named session -- a snippet in CR? (e.g., tmux new -s pg14) -
git clone should use https, so keys are not required -
how to temporarily leave tmux (Ctrl-b, Ctrl-z) -
[ ] check DnD mode for everyone at the beginning + power cords are plugged in, the laptop battery is far from 1%, etc. -
disable crons – the step disables but does not re-enable them (and old directory). For now, we skip it. But the danger is backup-push at 00:00 UTC – we should disable it manually and not allow it to run (though if it does run, we're probably fine too – worth elaborating) -
check that pg_hba is fine (local connections work, tmp line) // manual for now -
VERY IMPORTANT: check new backup location – should be empty // manual for now
-
-
pg_repack's old version is going to block pg_upgrade; we need the freshest version, with this fix -
gstg-registry inventory has issues !389 (comment 1367406340) -
ansible: more checks !411 (merged) -
check that test_replication doesn't exist -
check (or set?) restore_command to empty value on target -
check that data14 directory is empty -
check (or set, tmp-ly?) that local trust is in pg_hba, in first place, temporarily -
check that postgres_exporter schema doesn't exist --- UPDATE: instead, let's have a snippet to run only pre-checks (being able to run them in advance) -
check pg_repack has v1.4.7 or higher
-
-
stop logical replication in the very end !411 (merged) -
ansible: fix "database user "gitlab-psql" is not the install user" for the 'registry' cluster !411 (merged) -
check if this gitlab-org/gitlab#364370 (closed) can cause problems (when we add a postgres replica, app nodes might not see it, for up to 1h - or until their restart) -
test that app code can work with PG12 and PG14 at the same time (some standbys on one major PG version, other standbys and/or the primary on another) -
BEFORE procedure: check who is working with Postgres, trace all DB and OS users -
AFTER procedure and before running backup-push – switch to new backup location again? (extra protection)
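
Pre-check snippet idea (re: "a snippet to run only pre-checks" above): a minimal Python sketch of three of the listed checks, namely pg_repack v1.4.7 or higher, an empty data14 directory, and local trust as the first pg_hba rule. All function names here are hypothetical, and this is not what the ansible playbook does. It assumes the installed pg_repack version string is obtained separately (e.g., from pg_extension.extversion) and that pg_hba.conf has no quoted names containing spaces and no include directives. The SQL-side checks (test_replication, postgres_exporter schema, restore_command) are not covered.

```python
import os
import re


def version_tuple(version: str) -> tuple:
    """Turn '1.4.8' into (1, 4, 8) so versions compare numerically."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def pg_repack_is_new_enough(installed: str, minimum: str = "1.4.7") -> bool:
    """True if the installed pg_repack version is at least `minimum`."""
    return version_tuple(installed) >= version_tuple(minimum)


def directory_is_empty(path: str) -> bool:
    """True if the directory exists and contains no entries."""
    return os.path.isdir(path) and not os.listdir(path)


def first_hba_rule(hba_text: str):
    """Return the fields of the first non-blank, non-comment pg_hba line."""
    for line in hba_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if line:
            return line.split()
    return None


def local_trust_is_first(hba_text: str) -> bool:
    """True if the first rule is 'local <db> <user> trust'."""
    rule = first_hba_rule(hba_text)
    return (
        rule is not None
        and len(rule) >= 4
        and rule[0] == "local"
        and rule[3] == "trust"  # fields: TYPE DATABASE USER METHOD
    )


if __name__ == "__main__":
    # Example values only; real paths and versions come from the target host.
    print("pg_repack ok:", pg_repack_is_new_enough("1.4.8"))
    print("hba ok:", local_trust_is_first("local all all trust\n"))
```

Keeping each check a pure function means they can be run in advance against copies of the config, without touching the target cluster.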
For GPRD Registry
-
DR Delayed and Archive replica infrastructure - @anganga -
autovacuum killer – do it with logging -
pgbouncer nodes – OS settings need adjustment - @anganga https://gitlab.slack.com/archives/C04PDKLJGUW/p1681837599275849 :party-exclaim: -
Rollback plan in the CR -
Define threshold: 30 minutes https://gitlab.com/gitlab-com/gl-infra/db-migration/-/issues/12
-
-
Anything else the team would like to add
Edited by Biren Shah