Skip to content

Fix rename column concurrently

Patrick Bair requested to merge pb-fix-rename-column-concurrently into master

What does this MR do?

Related issue: #297378 (closed)

Reimplement the rename_column_concurrently helper to correctly sync data between the old and new column during no-downtime deployments. This requires using triggers that copy data bidirectionally between the old and new columns.

This is technically a bug fix of the existing helper, so the reasons for a reimplementation are:

  1. Support for columns with default values is tricky (more on this further down). Adding support for defaulted columns at this time adds significant complexity, and the current bug is blocking !50824 (merged). At the same time, the existing helper can't be replaced without support for defaulted columns, as existing migrations depend on it.

  2. The existing helper has been used a number of times in previous migrations. While that helper has a known issue, it only affects no-downtime deployments. Since no-downtime deployments require only upgrading a single version at a time, fixing the existing implementation won't fix those previous migrations (without backporting). As a result, there doesn't seem to be sense in risking breaking existing migrations to fix a bug that won't affect the installations that could benefit from the new code.

Default Values

As mentioned above, support for default values may be complicated. Even in a before trigger, default values are populated before the trigger fires. On an update trigger, the default value has no impact, because we know which of the columns was specified as an update target.

On an insert trigger, it currently works under the assumption that only one of the columns will be set by the application, so the column with a NULL value should be set to the value of the column that doesn't have the NULL value. This assumption breaks with default values, because neither column would be NULL.

Originally I thought we can use a view or remove the default value from the column, and set it manually in a trigger during the migration, which has effectively the same result. However, this won't work, as Rails uses the default value on a column to default attributes on a model object when it is initialized, provided the default is a constant value.

Imagine a table with a status column like:

status smallint not null default 0

and the application, some validation like:

validates :status, inclusion: { in: STATUSES }

If we remove the default value on the column, we can cause the validation to fail, even if the value would be set properly in the database.

I think we have two scenarios if the column has a default, which need to be handled differently. We'd have to check the default at the time the migration runs, and then:

  1. the default is a constant value - add additional checks in the insert trigger, where we inspect incoming values. If one column is set to the default, and the other is not, we can assume the non-defaulted value is the correct value.

  2. the default is an expression - Rails doesn't rely on this, so we can possibly remove the column default for the duration of the migration, set the default value manually in the insert trigger if necessary, and then add the column default back when the migration completes. This is also a rare enough occurrence in our existing schema that it may not be worth solving at all.

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
Edited by Patrick Bair

Merge request reports