Avoid access exclusive locks in SET NOT NULL migrations
From gitlab-com/gl-infra/production#1433 (comment 254165845), we had a migration (https://gitlab.com/gitlab-org/gitlab/blob/903713efbf84e260977a691a7608d3199320e628/db/migrate/20191112090226_add_artifacts_to_ci_build_need.rb#L14) that attempted to run:
ALTER TABLE 'ci_build_needs' ALTER COLUMN 'artifacts' SET NOT NULL
This should have been fast because the add_column_with_default
migration helper attempted to set all non-null values to the default true
value, but for some reason this locked the table for about 8-10 minutes before it completed, probably because it still had to do a full table scan of 9 million rows.
@gerardo.herzig proposes:
To avoid the ACCESS EXCLUSIVE LOCK acquired by SET NOT NULL
constraint, an alternative that does not block writes:
ALTER TABLE ci_build_needs ADD CONSTRAINT ci_builds_needs_artifacts_not_null
CHECK (artifacts IS NOT NULL)
NOT VALID;
And then:
ALTER TABLE ci_builds_needs VALIDATE CONSTRAINT ci_builds_needs_artifacts_not_null;
That requires only a SHARE UPDATE EXCLUSIVE
lock, which allows normal activity (INSERT/UPDATES/DELETES).
Differences between a column NOT NULL constraint and a CHECK CONSTRAINT not null
The end result may be the same, but there are still a few differences:
1. check constraints have to be named and belong to the table, while a NOT NULL column is just an option of the latter 2. from a performance point of view, PostgreSQL documentation says that in PostgreSQL creating an explicit not-null constraint is more efficient.
It seems to me that we should probably favor these constraints for large tables to avoid the table lock?
Thoughts @abrandl, @yorickpeterse?