Geo: Improve error messages when Geo migrations fail

db:migrate:geo can fail for several different reasons, but the error output gives little indication of which one applies. It would be good to improve the error message so the failure is easier to debug and fix.

 - gitlab-rake
  - db:migrate:geo
  delta: '0:00:29.532254'
  end: '2024-09-23 12:59:21.604470'
  msg: non-zero return code
  rc: 1
  start: '2024-09-23 12:58:52.072216'
  stderr: |-
    rake aborted!
    ActiveRecord::StatementInvalid: PG::ReadOnlySqlTransaction: ERROR:  cannot execute INSERT in a read-only transaction
    /opt/gitlab/embedded/service/gitlab-rails/app/models/application_record.rb:90:in `block in safe_find_or_create_by'
    /opt/gitlab/embedded/service/gitlab-rails/app/models/concerns/cross_database_modification.rb:92:in `block in transaction'
    /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database.rb:378:in `block in transaction'
    /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/database.rb:377:in `transaction'
    /opt/gitlab/embedded/service/gitlab-rails/app/models/concerns/cross_database_modification.rb:83:in `transaction'
    /opt/gitlab/embedded/service/gitlab-rails/app/models/application_record.rb:90:in `safe_find_or_create_by'
    /opt/gitlab/embedded/service/gitlab-rails/app/models/shard.rb:21:in `by_name'
    /opt/gitlab/embedded/service/gitlab-rails/app/models/shard.rb:17:in `block in populate!'
    /opt/gitlab/embedded/service/gitlab-rails/app/models/shard.rb:17:in `map'
    /opt/gitlab/embedded/service/gitlab-rails/app/models/shard.rb:17:in `populate!'
    /opt/gitlab/embedded/service/gitlab-rails/config/initializers/fill_shards.rb:9:in `<top (required)>'
    /opt/gitlab/embedded/service/gitlab-rails/config/environment.rb:7:in `<top (required)>'
    <internal:/opt/gitlab/embedded/lib/ruby/site_ruby/3.1.0/rubygems/core_ext/kernel_require.rb>:37:in `require'
    <internal:/opt/gitlab/embedded/lib/ruby/site_ruby/3.1.0/rubygems/core_ext/kernel_require.rb>:37:in `require'
    /opt/gitlab/embedded/bin/bundle:25:in `load'
    /opt/gitlab/embedded/bin/bundle:25:in `<main>'

#381585 (comment 2099732256)

First, I would expect rake db:migrate:geo to exit the fill_shards.rb initializer early because of its return if Gitlab::Database.read_only? guard. Since it didn't exit early, we know that read_only? returned false, which can happen if Gitlab::Geo.secondary? returns false. Perhaps the secondary site doesn't know it is a secondary? (The gitlab_rails['geo_node_name'] setting needs to match the name field of a geo_nodes row, and that row needs to have primary: false.)
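
To make that first point concrete, here is a minimal sketch of the decision path, assuming the guard ultimately depends on the configured node name matching a geo_nodes row with primary: false. GeoNode, NODES, secondary? and skip_fill_shards? are hypothetical stand-ins for illustration, not GitLab's actual classes or methods.

```ruby
# Hypothetical stand-in for a geo_nodes row; not GitLab's real model.
GeoNode = Struct.new(:name, :primary, keyword_init: true)

# Assumption: a site counts as a secondary only if its configured name
# matches a geo_nodes row whose primary flag is false.
def secondary?(configured_name, geo_nodes)
  node = geo_nodes.find { |n| n.name == configured_name }
  !node.nil? && !node.primary
end

# Simplified guard: the initializer only skips its work when the site
# recognises itself as a secondary (and so treats the database as read-only).
def skip_fill_shards?(configured_name, geo_nodes)
  secondary?(configured_name, geo_nodes)
end

NODES = [
  GeoNode.new(name: 'primary-site', primary: true),
  GeoNode.new(name: 'secondary-site', primary: false)
].freeze

skip_fill_shards?('secondary-site', NODES) # guard fires, no INSERT is attempted
skip_fill_shards?('secondery-site', NODES) # name mismatch, guard does not fire
```

In the second call the misspelled name matches no row, so the site would go on to attempt the write, which is the situation the stack trace suggests.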

Second, Shard.populate! attempts to insert rows into the shards table only if the table does not already match the machine config. The machine config is repositories.storages from gitlab.yml, which is generated from gitlab.rb. So if the primary site has already populated the storages, and the secondary site has the same storages configuration, then rake db:migrate:geo won't attempt to insert into the shards table.
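
The second point can be sketched as a set difference between the configured storage names and the names already in the shards table. This is an assumption about the shape of the logic, not the actual Shard.populate! source; missing_shards is a hypothetical helper.

```ruby
# Hypothetical helper: returns the storage names that would need an INSERT
# into the shards table. An empty result means no write is attempted.
def missing_shards(configured_storages, existing_shards)
  configured_storages - existing_shards
end

# Secondary configured like the primary: nothing to insert.
missing_shards(%w[default], %w[default])          # => []

# Secondary has an extra storage: an INSERT would be attempted.
missing_shards(%w[default storage2], %w[default]) # => ["storage2"]
```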

The error above stems from the Gitaly storage configuration being different on the primary and secondary sites. It is not related to the Geo migration itself: it fails because the secondary site is trying to update the storage configuration in the secondary database, which is read-only. It seems the secondary should never try to update this and should instead fail immediately because of the difference.
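
One possible shape for such an early failure is sketched below. This is only an illustration of the idea, not a proposed patch: ShardConfigMismatchError and ensure_shards_populated! are hypothetical names, and the read_only flag stands in for whatever check the real code would use.

```ruby
# Hypothetical error class; the name is illustrative only.
class ShardConfigMismatchError < StandardError; end

# Hypothetical helper: fails fast with a descriptive message when storages
# are missing from the shards table and the database cannot be written to.
def ensure_shards_populated!(configured_storages, existing_shards, read_only:)
  missing = configured_storages - existing_shards
  return [] if missing.empty?

  if read_only
    raise ShardConfigMismatchError,
          "Storages #{missing.join(', ')} are configured on this site but " \
          'missing from the shards table, and the database is read-only. ' \
          'Check that repositories.storages matches the primary site.'
  end

  missing # on a writable database these names would be inserted
end

begin
  ensure_shards_populated!(%w[default storage2], %w[default], read_only: true)
rescue ShardConfigMismatchError => e
  puts e.message
end
```

A message like this would name the mismatched storages directly, instead of surfacing a PG::ReadOnlySqlTransaction error from deep inside the initializer.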
