Bug in PostgreSQL causes integrity issues relating to foreign keys on partitioned tables

Everyone can contribute. Help move this issue forward while earning points, leveling up and collecting rewards.

Summary

A bug was fixed in PostgreSQL 16.5, 14.14, 15.9, 13.17 (and later) relating to FK constraints and partitioned tables.

Bug detail

Fix updates of catalog state for foreign-key constraints when attaching or detaching table partitions

If the referenced table is partitioned, then different catalog entries are needed for a referencing table that is stand-alone versus one that is a partition. ATTACH/DETACH PARTITION commands failed to perform this conversion correctly. In particular, after DETACH the now stand-alone table would be missing foreign-key enforcement triggers, which could result in the table later containing rows that fail the foreign-key constraint. A subsequent re-ATTACH could fail with surprising errors, too.

The way to fix this is to do ALTER TABLE DROP CONSTRAINT on the now stand-alone table for each faulty constraint, and then re-add the constraint. If re-adding the constraint fails, then some erroneous data has crept in. You will need to manually re-establish consistency between the referencing and referenced tables, then re-add the constraint.

This query can be used to identify broken constraints and construct the commands needed to recreate them:

SELECT conrelid::pg_catalog.regclass AS "constrained table",
       conname AS constraint,
       confrelid::pg_catalog.regclass AS "references",
       pg_catalog.format('ALTER TABLE %s DROP CONSTRAINT %I;',
                         conrelid::pg_catalog.regclass, conname) AS "drop",
       pg_catalog.format('ALTER TABLE %s ADD CONSTRAINT %I %s;',
                         conrelid::pg_catalog.regclass, conname,
                         pg_catalog.pg_get_constraintdef(oid)) AS "add"
FROM pg_catalog.pg_constraint c
WHERE contype = 'f' AND conparentid = 0 AND
   (SELECT count(*) FROM pg_catalog.pg_constraint c2
    WHERE c2.conparentid = c.oid) <>
   (SELECT count(*) FROM pg_catalog.pg_inherits i
    WHERE (i.inhparent = c.conrelid OR i.inhparent = c.confrelid) AND
      EXISTS (SELECT 1 FROM pg_catalog.pg_partitioned_table
              WHERE partrelid = i.inhparent));

Since it is possible that one or more of the ADD CONSTRAINT steps will fail, you should save the query's output in a file and then attempt to perform each step.

We have identified that GitLab instances are affected by this bug:

  • The issue arises when both tables in a foreign key relationship are partitioned.
  • Consequence: ON DELETE CASCADE does not occur on affected tables. (possibly also other operations like ON DELETE SET NULL - we need to check)
  • In the case of DELETE CASCADE records will persist on tables after the record they refer to has been deleted, despite there being a constraint that specifies that these records should not (and do not) exist.

This is because the system triggers that effect the DELETE operation are not present, owing the the bug.

Related records:

Steps to reproduce

See: #498868 (comment 2236226541)

Example Project

What is the current bug behavior?

ON DELETE CASCADE does not work for some partitioned tables.

Affected instances will contain tables with

  • records that need to be deleted
  • constraints that need to be recreated

(Additionally, they'll need to upgrade to a fixed PostgreSQL version - that isn't in scope for this issue!)

What is the expected correct behavior?

ON DELETE CASCADE works where implemented: FK constraints are both complete and genuinely valid.

Relevant logs and/or screenshots

Output of checks

Results of GitLab environment info

Expand for output related to GitLab environment info

(For installations with omnibus-gitlab package run and paste the output of:
`sudo gitlab-rake gitlab:env:info`)

(For installations from source run and paste the output of:
`sudo -u git -H bundle exec rake gitlab:env:info RAILS_ENV=production`)

Results of GitLab application Check

Expand for output related to the GitLab application check

(For installations with omnibus-gitlab package run and paste the output of: sudo gitlab-rake gitlab:check SANITIZE=true)

(For installations from source run and paste the output of: sudo -u git -H bundle exec rake gitlab:check RAILS_ENV=production SANITIZE=true)

(we will only investigate if the tests are passing)

Possible fixes

Edited by 🤖 GitLab Bot 🤖