Fix schema helpers for restricted migrations
What does this MR do and why?
Related to #356276 (closed)
Update a couple migration helpers for concurrent schema changes that mix DML and DDL operations. This will no longer be allowed under Gitlab::Database::Migration[2.0]
, due to the decomposition effort.
In this case, the helpers rely on create_column_from
which creates and populates a new column from an existing column. Since update_column_in_batches
is used to populate the new column in the migration, it violates the rule of running DML during a DDL-only migration.
To fix this, require the helpers to run in DDL mode, which is necessary to keep the schema in sync between multiple databases. At the same time, we can suppress the error caused by update_column_in_batches
running DML to populate the column.
How to set up and validate locally
-
Create a new migration
db/post_migrate/20220412193755_rename_project_updated_at_column.rb
which uses one of these helpers:# frozen_string_literal: true class RenameProjectUpdatedAtColumn < Gitlab::Database::Migration[2.0] disable_ddl_transaction! def up rename_column_concurrently :projects, :updated_at, :last_changed_at end def down undo_rename_column_concurrently :projects, :updated_at, :last_changed_at end end
-
Run the migration on
master
, which should give this error:Gitlab::Database::QueryAnalyzers::RestrictAllowedSchemas::DMLNotAllowedError: Select/DML queries (SELECT/UPDATE/DELETE) are disallowed in the DDL (structure) mode. Modifying of '["projects"]' ([:gitlab_main]) with 'SELECT COUNT(*) AS count FROM "projects" /*application:web,db_config_name:main,line:/lib/gitlab/database/migration_helpers.rb:519:in `update_column_in_batches'*/'
-
Cleanup the database by removing the column (it should only be on
main
since the migration failed):alter table projects drop column last_changed_at;
-
Re-run the migration on this branch, and you should the migration output for both
main
andci
. Also, assuming yourmain
database has data inprojects
andci
doesn't, you'll see the batching forupdate_column_in_batches
:Output from main
-- exec_query("SELECT COUNT(*) AS count FROM \"projects\"") -> 0.0055s -- exec_query("SELECT \"projects\".\"id\" FROM \"projects\" ORDER BY \"projects\".\"id\" ASC LIMIT 1") -> 0.0012s -- exec_query("SELECT \"projects\".\"id\" FROM \"projects\" WHERE \"projects\".\"id\" >= 1 ORDER BY \"projects\".\"id\" ASC LIMIT 1 OFFSET 1") -> 0.0011s -- execute("UPDATE \"projects\" SET \"last_changed_at\" = \"projects\".\"updated_at\" WHERE \"projects\".\"id\" >= 1 AND \"projects\".\"id\" < 2") -> 0.0023s -- exec_query("SELECT \"projects\".\"id\" FROM \"projects\" WHERE \"projects\".\"id\" >= 2 ORDER BY \"projects\".\"id\" ASC LIMIT 1 OFFSET 1") -> 0.0009s -- execute("UPDATE \"projects\" SET \"last_changed_at\" = \"projects\".\"updated_at\" WHERE \"projects\".\"id\" >= 2 AND \"projects\".\"id\" < 3") -> 0.0011s -- exec_query("SELECT \"projects\".\"id\" FROM \"projects\" WHERE \"projects\".\"id\" >= 3 ORDER BY \"projects\".\"id\" ASC LIMIT 1 OFFSET 1") -> 0.0008s -- execute("UPDATE \"projects\" SET \"last_changed_at\" = \"projects\".\"updated_at\" WHERE \"projects\".\"id\" >= 3 AND \"projects\".\"id\" < 4") -> 0.0011s -- exec_query("SELECT \"projects\".\"id\" FROM \"projects\" WHERE \"projects\".\"id\" >= 4 ORDER BY \"projects\".\"id\" ASC LIMIT 1 OFFSET 1") -> 0.0009s -- execute("UPDATE \"projects\" SET \"last_changed_at\" = \"projects\".\"updated_at\" WHERE \"projects\".\"id\" >= 4 AND \"projects\".\"id\" < 5") -> 0.0011s -- exec_query("SELECT \"projects\".\"id\" FROM \"projects\" WHERE \"projects\".\"id\" >= 5 ORDER BY \"projects\".\"id\" ASC LIMIT 1 OFFSET 1") -> 0.0010s -- execute("UPDATE \"projects\" SET \"last_changed_at\" = \"projects\".\"updated_at\" WHERE \"projects\".\"id\" >= 5 AND \"projects\".\"id\" < 6") -> 0.0011s -- exec_query("SELECT \"projects\".\"id\" FROM \"projects\" WHERE \"projects\".\"id\" >= 6 ORDER BY \"projects\".\"id\" ASC LIMIT 1 OFFSET 1") -> 0.0009s -- execute("UPDATE \"projects\" SET \"last_changed_at\" = \"projects\".\"updated_at\" WHERE \"projects\".\"id\" >= 6 AND \"projects\".\"id\" < 7") -> 0.0012s -- exec_query("SELECT \"projects\".\"id\" FROM \"projects\" WHERE \"projects\".\"id\" >= 7 ORDER BY \"projects\".\"id\" ASC LIMIT 1 OFFSET 1") -> 0.0012s -- execute("UPDATE \"projects\" SET \"last_changed_at\" = \"projects\".\"updated_at\" WHERE \"projects\".\"id\" >= 7 AND \"projects\".\"id\" < 8") -> 0.0011s -- exec_query("SELECT \"projects\".\"id\" FROM \"projects\" WHERE \"projects\".\"id\" >= 8 ORDER BY \"projects\".\"id\" ASC LIMIT 1 OFFSET 1") -> 0.0009s -- execute("UPDATE \"projects\" SET \"last_changed_at\" = \"projects\".\"updated_at\" WHERE \"projects\".\"id\" >= 8")
Output from ci
-- exec_query("SELECT COUNT(*) AS count FROM \"projects\"") -> 0.0055s
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.