Upgrade to 18.0: No such column: geo_nodes.verification_max_capacity
Problem
Upgrade to 18.0.0 from 17.11.2 failed on a migration:
No such column: geo_nodes.verification_max_capacity
/opt/gitlab/embedded/service/gitlab-rails/db/migrate/20250312124044_change_geo_concurrency_default_settings.rb:8:in `up'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migration_helpers/restrict_gitlab_schema.rb:33:in `block in exec_migration'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/query_analyzer.rb:83:in `within'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migration_helpers/restrict_gitlab_schema.rb:30:in `exec_migration'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migration_helpers/automatic_lock_writes_on_tables.rb:21:in `exec_migration'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/with_lock_retries.rb:123:in `run_block'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/with_lock_retries.rb:134:in `block in run_block_with_lock_timeout'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:127:in `public_send'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:127:in `block in write_using_load_balancer'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:141:in `block in read_write'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:228:in `retry_with_backoff'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/load_balancer.rb:130:in `read_write'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:126:in `write_using_load_balancer'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/load_balancing/connection_proxy.rb:78:in `transaction'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/with_lock_retries.rb:129:in `run_block_with_lock_timeout'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/with_lock_retries.rb:97:in `run'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migrations/lock_retry_mixin.rb:52:in `ddl_transaction'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migrations/runner_backoff/active_record_mixin.rb:21:in `execute_migration_in_transaction'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database/migrations/pg_backend_pid.rb:14:in `with_advisory_lock'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/db.rake:189:in `configure_database'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/db.rake:156:in `configure_pg_databases'
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/db.rake:102:in `block (3 levels) in <top (required)>'
/opt/gitlab/embedded/bin/bundle:25:in `load'
/opt/gitlab/embedded/bin/bundle:25:in `<main>'
I want to upgrade to 18.0. Will I be affected?
If you ever ran GitLab EE in the past and later switched that environment to GitLab CE/FOSS, then you may be affected. If you never did this, then you are likely not affected.
You can confirm this by doing the following:
- Output the differences between your database schema and what the GitLab application expects the database schema to look like:
sudo gitlab-rake gitlab:db:schema_checker:run > schema_drift.txt
- See if your output contains: "The table geo_nodes has columns missing from the database":
grep "The table geo_nodes has columns missing from the database" schema_drift.txt
If that line is found, then you are affected by missing columns in the geo_nodes table. In that case, do not upgrade to GitLab 18.0.0 or 18.0.1. Wait for the fix from this issue to be released, and then upgrade to a version that includes it.
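The check above can be wrapped in a small script, for example to run it on several hosts. This is only a sketch: the function names geo_nodes_columns_missing and check_upgrade_readiness are hypothetical, and it assumes the schema checker output has already been saved to a file as shown earlier.

```shell
# Hypothetical helpers, not part of GitLab.
# Succeeds (exit 0) when the saved schema checker output reports
# missing columns on the geo_nodes table.
geo_nodes_columns_missing() {
  grep -qF "The table geo_nodes has columns missing from the database" "$1"
}

# Prints a short verdict and returns 1 when the instance is affected.
check_upgrade_readiness() {
  if geo_nodes_columns_missing "$1"; then
    echo "affected: do not upgrade to 18.0.0 or 18.0.1 yet"
    return 1
  fi
  echo "not affected by the geo_nodes missing-columns issue"
}

# Usage sketch:
#   sudo gitlab-rake gitlab:db:schema_checker:run > schema_drift.txt
#   check_upgrade_readiness schema_drift.txt
```

The non-zero return code makes it possible to use the check as a gate in an upgrade script.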
Workarounds
Workaround to add both missing columns, verification_max_capacity and minimum_reverification_interval, in sudo gitlab-psql:
ALTER TABLE geo_nodes ADD COLUMN verification_max_capacity integer default 10, ADD COLUMN minimum_reverification_interval integer default 90;
Workaround to add only the one missing column, minimum_reverification_interval, in sudo gitlab-psql:
ALTER TABLE geo_nodes ADD COLUMN minimum_reverification_interval integer default 90;
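If it is unclear which of the two workarounds applies, the statement can be derived from the columns that currently exist. The following is a minimal sketch, not an official tool: missing_geo_columns_sql is a hypothetical helper that reads existing column names on stdin and prints an ALTER TABLE statement for whichever of the two columns is absent, using the same types and defaults as the workarounds above. Passing -At -c through gitlab-psql to psql is an assumption about the wrapper.

```shell
# Hypothetical helper, not part of GitLab.
# Reads existing geo_nodes column names on stdin, one per line, and prints
# an ALTER TABLE statement that adds only the columns that are missing.
# Prints nothing when both columns already exist.
missing_geo_columns_sql() {
  existing="$(cat)"
  clauses=""
  for spec in "verification_max_capacity:10" "minimum_reverification_interval:90"; do
    column="${spec%%:*}"   # text before the colon
    default="${spec##*:}"  # text after the colon
    if ! printf '%s\n' "$existing" | grep -qxF "$column"; then
      clauses="${clauses:+$clauses, }ADD COLUMN $column integer default $default"
    fi
  done
  if [ -n "$clauses" ]; then
    printf 'ALTER TABLE geo_nodes %s;\n' "$clauses"
  fi
}

# Usage sketch (assumes gitlab-psql forwards these flags to psql):
#   sudo gitlab-psql -At -c "SELECT column_name FROM information_schema.columns WHERE table_name = 'geo_nodes'" | missing_geo_columns_sql
```

Review the printed statement before running it in sudo gitlab-psql.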
Possible fix
I think the proper codebase fix is to add another migration ensuring that the geo_nodes table has the right schema. At a minimum, the migration should add the columns that are missing. Modifying existing columns depending on the particular schema drift can get complex, so I would not attempt that in this issue.
We must ensure that the fix gets into a patch release of the same version that 20250312124044_change_geo_concurrency_default_settings landed in, which is v18.0.
Implementation Plan
- Install GDK (you don't need Geo or even a license)
- Simulate the issue:
gdk psql -c "ALTER TABLE geo_nodes DROP COLUMN verification_max_capacity, DROP COLUMN minimum_reverification_interval;"
- Create a Rails migration which adds those columns if they don't exist:
# Example migration structure
class AddMissingGeoNodesColumns < Gitlab::Database::Migration[2.2]
  milestone '18.0'

  def up
    # Add each column only when schema drift has left it missing
    unless column_exists?(:geo_nodes, :verification_max_capacity)
      add_column :geo_nodes, :verification_max_capacity, :integer, default: 10
    end

    unless column_exists?(:geo_nodes, :minimum_reverification_interval)
      add_column :geo_nodes, :minimum_reverification_interval, :integer, default: 90
    end
  end

  def down
    # No need to remove columns in down migration
  end
end
- Run migrations:
gdk migrate
- Verify the fix:
gdk psql -c "SELECT column_name, data_type, column_default FROM information_schema.columns WHERE table_name = 'geo_nodes' AND column_name IN ('verification_max_capacity', 'minimum_reverification_interval')"
- Test that GitLab works by starting GDK and verifying that there are no errors in the logs
- Submit an MR
- After it is approved and merged, backport the fix to 18.0
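The "Verify the fix" step can also be turned into a pass/fail check. This is a sketch under assumptions: geo_columns_present is a hypothetical helper, and passing -At through gdk psql to psql (to get one bare column name per line) is assumed rather than confirmed.

```shell
# Hypothetical helper, not part of GDK.
# Reads column names on stdin, one per line. Succeeds only if both
# columns restored by the migration are present.
geo_columns_present() {
  columns="$(cat)"
  for required in verification_max_capacity minimum_reverification_interval; do
    if ! printf '%s\n' "$columns" | grep -qxF "$required"; then
      echo "missing column: $required" >&2
      return 1
    fi
  done
  echo "geo_nodes has both columns"
}

# Usage sketch (assumes gdk psql forwards these flags to psql):
#   gdk psql -At -c "SELECT column_name FROM information_schema.columns WHERE table_name = 'geo_nodes'" | geo_columns_present
```

Running it once after simulating the issue (expected to fail) and once after gdk migrate (expected to pass) confirms that the migration is what restored the columns.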
Backporting the Fix to 18.0
Here are the detailed steps to backport the fix to the 18-0-stable-ee branch:
1. Create a new branch from 18-0-stable-ee
# Fetch the latest changes
git fetch origin
# Create a new branch based on 18.0-stable-ee
git checkout -b fix-geo-nodes-columns-18-0-backport origin/18-0-stable-ee
2. Cherry-pick the commit
# Replace COMMIT_HASH with the actual commit hash of the fix
git cherry-pick COMMIT_HASH
If there are conflicts during the cherry-pick:
# Resolve conflicts manually in your editor
# After resolving conflicts
git add .
git cherry-pick --continue
If you are cherry-picking the merge commit on master, then use -m 1:
# Replace COMMIT_HASH with the actual commit hash of the fix
git cherry-pick -m 1 COMMIT_HASH
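If you are unsure whether the commit is a merge commit, you can test whether it has a second parent. A minimal sketch: cherry_pick_args is a hypothetical helper that prints the arguments to pass to git cherry-pick. It relies on git rev-parse --verify --quiet failing when COMMIT^2 does not exist.

```shell
# Hypothetical helper, not part of git or GitLab tooling.
# Prints "-m 1 COMMIT" for a merge commit (one that has a second parent)
# and just "COMMIT" for a regular commit.
cherry_pick_args() {
  commit="$1"
  if git rev-parse --verify --quiet "${commit}^2" >/dev/null; then
    echo "-m 1 $commit"
  else
    echo "$commit"
  fi
}

# Usage sketch (the unquoted expansion is intentional, to split the arguments):
#   git cherry-pick $(cherry_pick_args COMMIT_HASH)
```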
3. Create a merge request
- Push your branch:
git push -u origin fix-geo-nodes-columns-18-0-backport
- Create a merge request on GitLab:
  - Target branch: 18-0-stable-ee
  - Title: "Backport: Fix geo_nodes migration to handle missing columns"
  - Description: Include a reference to the original issue (#543146) and explain that this is a backport to fix upgrade issues in 18.0
4. Special considerations for backporting
- In the MR description, clearly state this is a backport to fix a critical upgrade issue
- Add the following labels to your MR (or try /copy_metadata #543146):
  - backport
  - type::bug
  - severity::2 (as in the original issue)
  - group::geo
- Explain in the MR description that this fix is needed to prevent upgrade failures for users upgrading to 18.0 who have the geo_nodes table but are missing these specific columns due to schema drift
To see how your DB schema differs from what the GitLab application expects the DB to look like
# Run schema checker and save full output
sudo gitlab-rake gitlab:db:schema_checker:run > schema_drift.txt
Optionally, filter out integer to bigint conversions if there are very many of them, in order to make it easier to view other differences.
# Filter out bigint conversions and save to another file
awk 'BEGIN { RS = "------------------------------------------------------\n"; ORS = "" }
{
  # Keep only records that are not bigint conversions and whose diff is not empty
  if ($0 !~ /convert_to_bigint|integer.*bigint|namespace_descendants.*DEFAULT.*::bigint/ &&
      $0 !~ /Diff:[ \n]*$/ &&
      NF > 0) {
    print $0 RS;
  }
}' schema_drift.txt > schema_drift_no_bigint_conversions.txt