post-migration to partition ci_pipelines_config table fails with 'would overlap partition' error

Summary

PG::InvalidObjectDefinition: ERROR:  partition "ci_pipelines_config" would overlap partition "ci_pipelines_config_100"
LINE 3:   FOR VALUES IN (100, 101, 102);
Part of the stack trace:
/srv/gitlab/db/post_migrate/20240903074926_partition_ci_pipelines_config.rb:13:in `block in up'
/srv/gitlab/lib/gitlab/database/with_lock_retries.rb:123:in `run_block'
/srv/gitlab/lib/gitlab/database/with_lock_retries.rb:134:in `block in run_block_with_lock_timeout'

This is a %17.4 migration introduced by Partition ci_pipelines_config table (!164455 - merged)

Steps to reproduce

  1. Upgrade from 17.3 to 17.5, running the regular migrations separately from the post-migrations.
  2. The post-migrations fail with the error shown above.

Workaround

In GitLab %17.6 a migration will ship that replaces ci_pipelines_config with the three partitions that are being created here. The same will be done for the metadata table.

Affected instances can therefore keep running with the three dynamically created partitions. The %17.6 migration will handle this, because it creates the tables with IF NOT EXISTS.

  1. Mark the migration as completed. A second migration would go on to fail in the same way, so mark both as up:

    # ci_pipelines_config
    gitlab-rake gitlab:db:mark_migration_complete[20240903074926]
    # ci_build_trace_metadata
    gitlab-rake gitlab:db:mark_migration_complete[20240917143249]

    NOTE: Setting migrations complete (up) in this way isn't usually a valid workaround for upgrade / database migration issues.

    If you're reading this issue because you have a similar error for different tables or migrations, you should NOT set migrations up unless this is a documented workaround for your specific situation.

  2. Run the remaining migrations:

    gitlab-rake db:migrate
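To confirm that the workaround took effect, the status of the two migrations can be checked afterwards. The following is a minimal sketch, assuming a Linux package (Omnibus) installation where `gitlab-rake` is on the PATH and must be run via `sudo`. The variable and function names are illustrative only and are not part of GitLab.

```shell
#!/usr/bin/env bash
# Versions of the two migrations marked complete in step 1 (taken from this issue).
MIGRATIONS='20240903074926|20240917143249'

# Hypothetical helper: print only the status lines for those two migrations.
check_marked_migrations() {
  # db:migrate:status prints one line per migration, including its up/down state.
  sudo gitlab-rake db:migrate:status | grep -E "$MIGRATIONS"
}

# Only run the check on a host where the gitlab-rake wrapper is available.
if command -v gitlab-rake >/dev/null 2>&1; then
  check_marked_migrations
fi
```

Both versions should be reported as up. If either is still down, `gitlab-rake db:migrate` will attempt to run it again.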

Example Project

What is the current bug behavior?

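The migration appears to drop the ci_pipelines_config table and then re-create it as a partition for the values 100, 101, and 102. On affected instances a dynamically created partition (ci_pipelines_config_100) already exists, so creating the new partition fails with the overlap error.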
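To check whether an instance is in this state, the existing partitions and their bounds can be listed from the PostgreSQL catalog. This is a sketch only, assuming a Linux package (Omnibus) installation with the bundled `gitlab-psql` wrapper connected to the main database. The variable and function names are illustrative, and the name filter is a guess at what is relevant based on the partition name in the error message.

```shell
#!/usr/bin/env bash
# Catalog query: list every partition whose name starts with ci_pipelines_config,
# together with its parent table and its partition bounds.
# Note: "_" is a single-character wildcard in LIKE, so the filter is slightly loose.
PARTITION_QUERY="
SELECT parent.relname AS parent_table,
       child.relname  AS partition_name,
       pg_get_expr(child.relpartbound, child.oid) AS bounds
FROM pg_inherits
JOIN pg_class parent ON parent.oid = pg_inherits.inhparent
JOIN pg_class child  ON child.oid  = pg_inherits.inhrelid
WHERE child.relname LIKE 'ci_pipelines_config%'
ORDER BY child.relname;"

# Hypothetical helper: run the query through the bundled psql wrapper.
list_config_partitions() {
  sudo gitlab-psql -c "$PARTITION_QUERY"
}

# Only run the query on a host where gitlab-psql is available.
if command -v gitlab-psql >/dev/null 2>&1; then
  list_config_partitions
fi
```

If the output already shows partitions such as ci_pipelines_config_100 with bounds that include 100, 101, or 102, a new partition declared FOR VALUES IN (100, 101, 102) cannot be created alongside them, which matches the error in the summary.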

What is the expected correct behavior?

The migration completes successfully.

Relevant logs and/or screenshots

Migration output:
Running db:migrate rake task
main: == [advisory_lock_connection] object_id: 53320, pg_backend_pid: 3726985
main: == 20240903074926 PartitionCiPipelinesConfig: migrating =======================
main: -- transaction_open?(nil)
main:    -> 0.0000s
main: -- quote_table_name(:p_ci_pipelines)
main:    -> 0.0000s
main: -- quote_table_name(:ci_pipelines_config)
main:    -> 0.0000s
main: -- execute("LOCK TABLE \"p_ci_pipelines\", \"ci_pipelines_config\" IN ACCESS EXCLUSIVE MODE")
main:    -> 0.0009s
main: -- drop_table(:ci_pipelines_config)
main:    -> 0.0021s
main: == [advisory_lock_connection] object_id: 53320, pg_backend_pid: 3726985

Possible fixes

Edited by Ben Prescott (ex-GitLab)