Database migration failing from 13.3.9 to 13.6.2 after server migration
Summary
On an Omnibus installation, the database migration fails during the upgrade from 13.3.9 to 13.6.2.
Steps to reproduce
Here's what I did to end up in this situation:
- Do a full backup of a running GitLab 13.3.9 on Debian Jessie.
- Prepare a new GitLab server based on Debian Buster.
- At this point, I think I inadvertently installed the then-current GitLab 13.5.3 instead of 13.3.9. I rolled back to 13.3.9 following the official instructions before restoring the backup, though.
- Restore the backup from the old server.
- ... a couple of weeks go by without incidents ...
- Try to upgrade the omnibus installation to the latest GitLab 13.6.2.
What is the current bug behavior?
The database schema update within the Omnibus upgrade is failing with:
Concrete error stack
== 20200920130356 AddContainerExpirationPolicyWorkerSettingsToApplicationSettings: migrating
-- column_exists?(:application_settings, :container_registry_expiration_policies_worker_capacity)
   -> 0.0370s
-- add_column(:application_settings, :container_registry_expiration_policies_worker_capacity, :integer, {:default=>0, :null=>false})
rake aborted!
StandardError: An error has occurred, this and all later migrations canceled:

PG::DuplicateTable: ERROR: relation "postgres_indexes" already exists
/opt/gitlab/embedded/service/gitlab-rails/db/migrate/20200922093004_add_postgres_index_view.rb:7:in `up'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/db.rake:59:in `block (3 levels) in <top (required)>'
/opt/gitlab/embedded/bin/bundle:23:in `load'
/opt/gitlab/embedded/bin/bundle:23:in `<main>'

Caused by:
ActiveRecord::StatementInvalid: PG::DuplicateTable: ERROR: relation "postgres_indexes" already exists
/opt/gitlab/embedded/service/gitlab-rails/db/migrate/20200922093004_add_postgres_index_view.rb:7:in `up'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/db.rake:59:in `block (3 levels) in <top (required)>'
/opt/gitlab/embedded/bin/bundle:23:in `load'
/opt/gitlab/embedded/bin/bundle:23:in `<main>'

Caused by:
PG::DuplicateTable: ERROR: relation "postgres_indexes" already exists
/opt/gitlab/embedded/service/gitlab-rails/db/migrate/20200922093004_add_postgres_index_view.rb:7:in `up'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/db.rake:59:in `block (3 levels) in <top (required)>'
/opt/gitlab/embedded/bin/bundle:23:in `load'
/opt/gitlab/embedded/bin/bundle:23:in `<main>'
Tasks: TOP => db:migrate
(See full trace by running task with --trace)
See the full upgrade log: https://drive.google.com/file/d/1dSPIzVbTBPcMzFxuHQk1WCNvBCn6bh6-/view?usp=sharing
What is the expected correct behavior?
The Omnibus upgrade scripts should handle whatever caused this situation.
Relevant logs and/or screenshots
The full upgrade log: https://drive.google.com/file/d/1dSPIzVbTBPcMzFxuHQk1WCNvBCn6bh6-/view?usp=sharing
Output of checks
(Since this is our production instance, I had to restore GitLab 13.3.9 using our latest backups and cannot interrupt the service currently. So I currently cannot "break" it in order to run checks. Please see "Possible fixes" below though.)
Possible fixes
I searched for this particular issue and came across #258580 (closed). Since its timing appears to fall between the 13.3.9 and 13.6.2 releases, here's what might have happened:
- Backup created with v13.3.9 on old server
  - Everything went well.
- Current GitLab version inadvertently installed on new server
  - All database migrations (including creation of the view) were executed.
- GitLab version on new server rolled back to 13.3.9
  - All tables were deleted / recreated, but as stated in #258580 (closed), the views were left in place.
- Restore of 13.3.9 backup onto new server
  - Successful, as the "rogue" view had no impact.
- Attempt to upgrade new server to 13.6.2
  - Fails since the view already exists and the migration attempts to create it.
After rolling back to 13.3.9 and restoring my last backup, I checked for the existence of the view and found it in the public schema:
gitlabhq_production=# select table_name, table_schema from INFORMATION_SCHEMA.views WHERE table_schema = ANY (current_schemas(false));
table_name | table_schema
------------------+--------------
postgres_indexes | public
(1 row)
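The same check could be scripted as a guard before an upgrade. This is only a minimal sketch, assuming the Omnibus-bundled psql and socket path used elsewhere in this report; the `PSQL` variable and the `view_exists` function name are made up for illustration:

```shell
# Hypothetical helper. PSQL holds the psql invocation and can be
# overridden for installations with different paths (assumption).
PSQL=${PSQL:-"sudo -u gitlab-psql /opt/gitlab/embedded/bin/psql -h /var/opt/gitlab/postgresql -d gitlabhq_production"}

# Returns 0 if a view named postgres_indexes exists in the public schema,
# 1 if it does not, and 2 if the query itself could not be run.
view_exists() {
  # -t: tuples only, -A: unaligned output, -c: run a single command
  result=$($PSQL -tAc "SELECT 1 FROM information_schema.views WHERE table_schema = 'public' AND table_name = 'postgres_indexes'") || return 2
  [ "$result" = "1" ]
}
```

Running `view_exists && echo "leftover postgres_indexes view found"` on the 13.3.x instance would then flag the situation before the package upgrade is attempted.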
Edit
I managed to reproduce this exact issue and validate my guess about what happened in a test VM following the steps above. In the VM, I managed to solve the issue by first dropping the view before attempting the upgrade to the latest GitLab version like so:
In GitLab 13.3.*:
sudo -u gitlab-psql /opt/gitlab/embedded/bin/psql -h /var/opt/gitlab/postgresql -d gitlabhq_production
DROP VIEW public.postgres_indexes;
Then upgrade to the latest GitLab:
apt-get update && apt-get upgrade
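For anyone who prefers to script the drop instead of typing it into an interactive psql session, here is a rough sketch. It assumes the same psql invocation as above; the `PSQL` variable and the `drop_stale_view` name are hypothetical. `IF EXISTS` makes the statement safe to run even when the view is not there:

```shell
# Hypothetical helper. PSQL holds the psql invocation and can be
# overridden for installations with different paths (assumption).
PSQL=${PSQL:-"sudo -u gitlab-psql /opt/gitlab/embedded/bin/psql -h /var/opt/gitlab/postgresql -d gitlabhq_production"}

# Drops the leftover view if it is present; if it is absent,
# PostgreSQL only emits a notice and nothing is changed.
drop_stale_view() {
  $PSQL -c "DROP VIEW IF EXISTS public.postgres_indexes;"
}
```

I only verified the manual variant in my test VM, so take a fresh backup before calling `drop_stale_view` on a real instance.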
I guess it's your decision whether you need / want to include anything in your migration scripts to catch this situation.