Add check for unused replication slots

Adding a new check

Closes #127

This check looks in both the application and Geo tracking databases for unused replication slots. If any are found, the check shows a message advising removal of the unused replication slot.

Verification steps for review

To interpret the test results, note that each spot command below ends in `wc -l`, so it prints a count of matching message lines:

  • 0 if the check passes (no message is shown)
  • 1 if the check fails (message is shown)
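
For reviewers who want a readable verdict instead of a bare number, the mapping above can be sketched as a small shell function. This is only an illustration: `interpret_count` is a hypothetical helper, not part of spot or of this MR, and it assumes the input is the line count printed by `wc -l` (including any leading whitespace).

```shell
# Hypothetical helper (not part of spot): map the `wc -l` count to a verdict.
interpret_count() {
  # Some wc implementations pad the number with spaces; strip them.
  count=$(printf '%s' "$1" | tr -d '[:space:]')
  case "$count" in
    0)
      echo "pass: no message shown"
      ;;
    ''|*[!0-9]*)
      # Empty or non-numeric input means the pipeline did not produce a count.
      echo "error: not a number: '$1'"
      return 2
      ;;
    *)
      echo "fail: message shown ($count matching line(s))"
      return 1
      ;;
  esac
}
```

It could be used by wrapping any of the spot pipelines in the tests below, for example `interpret_count "$(spot -p all_playbook.yml -n "20250609_3617 - run check" --dbg | grep "Run the check and store result" | grep '"message":' | wc -l)"`.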

Test notes:

  • Tests were performed on a Geo primary site and a Patroni cluster for the application database
  • This check also runs against Geo tracking databases. That path was not tested, but it is expected to behave the same way

Test 1 - run on Geo primary site with no unused replication slots on the application database

# From Geo primary
❯ gitlab-psql -c "SELECT count(*) from pg_replication_slots where active = false"
 count
-------
     0
(1 row)

# From spot instance
❯ spot -p all_playbook.yml -n "20250609_3617 - run check" --dbg | grep "Run the check and store result" | grep '"message":' | wc -l
       0

Test 2 - run on Geo primary site with an unused replication slot on the application database

# From Geo primary
❯ gitlab-psql -c "SELECT pg_create_physical_replication_slot('test_slot')"
 pg_create_physical_replication_slot
-------------------------------------
 (test_slot,)
(1 row)

❯ gitlab-psql -c "SELECT count(*) from pg_replication_slots where active = false"
 count
-------
     1
(1 row)

# From spot instance
❯ spot -p all_playbook.yml -n "20250609_3617 - run check" --dbg | grep "Run the check and store result" | grep '"message":' | wc -l
       1

# From Geo primary
❯ gitlab-psql -c "SELECT pg_drop_replication_slot('test_slot')"
 pg_drop_replication_slot
--------------------------

(1 row)

Test 3 - run on Patroni leader with no unused replication slots on the application database

# From Patroni leader
❯ gitlab-psql -c "SELECT count(*) from pg_replication_slots where active = false"
 count
-------
     0
(1 row)

# From spot instance
❯ spot -p all_playbook.yml -n "20250609_3617 - run check" --dbg | grep "Run the check and store result" | grep '"message":' | wc -l
       0

Test 4 - run on Patroni leader with an unused replication slot on the application database

# From Patroni leader
❯ gitlab-psql -c "SELECT pg_create_physical_replication_slot('test_slot')"
 pg_create_physical_replication_slot
-------------------------------------
 (test_slot,)
(1 row)

❯ gitlab-psql -c "SELECT count(*) from pg_replication_slots where active = false"
 count
-------
     1
(1 row)

# From spot instance
❯ spot -p all_playbook.yml -n "20250609_3617 - run check" --dbg | grep "Run the check and store result" | grep '"message":' | wc -l
       1

# From Patroni leader
❯ gitlab-psql -c "SELECT pg_drop_replication_slot('test_slot')"
 pg_drop_replication_slot
--------------------------

(1 row)

Test 5 - run on Patroni replica on the application database

# From Patroni replica
❯ gitlab-psql -c "SELECT count(*) from pg_replication_slots where active = false"
 count
-------
     0
(1 row)

# From spot instance
❯ spot -p all_playbook.yml -n "20250609_3617 - run check" --dbg | grep "Run the check and store result" | grep '"message":' | wc -l
       0

Author checklist

  • After opening the MR:
    • Set it to the current milestone
    • Ask the Maintainer from the Reviewer roulette suggestion for review

Reviewer checklist

  • I followed the verification steps and confirm the functionality of the new check
    • I executed the check as presented in this MR by running the generated playbook with spot
    • In case of unexpected/odd behavior here, verify the generated playbook to account for potential YAML parsing issues
  • This check performs only read operations
  • This check does not output more than necessary on stdout for the check to function
  • The message explains what it means when this check does not pass
  • The workaround_url provides actionable information/steps for affected users
    • Consider if a Knowledge Base article should exist to serve as the ideal workaround URL
  • This check is not using the Rails console/runner, or has Maintainer approval for doing so
  • If this is a breaking change check:
    • It has the corresponding xx_breaking_changes tag (xx being the major release version for the change)
    • The workaround_url goes to the entry on the https://docs.gitlab.com/update/deprecations/ page
    • The ref_url goes to the deprecation issue linked from that entry
    • The title is the same as that entry
    • The version_started is equal to the announcement_milestone of the deprecation
    • The version_fixed is equal to the removal_milestone of the deprecation
Edited by Anton Smith
