GitLab 18.9 upgrade failure: ValidateUserAgentDetailsOrganizationIdNullConstraint migration crashes on user_agent_details
### Environment

- GitLab EE, upgraded from 18.8.4 to 18.9.0
- PostgreSQL 16
- Omnibus package on Debian 13

### Symptoms

Running `gitlab-rake db:migrate` after upgrading to 18.9.0 fails with:

```bash
PG::CheckViolation: ERROR: check constraint "check_17a3a18e31" of relation "user_agent_details" is violated by a row
```

The migration `20260126183256 ValidateUserAgentDetailsOrganizationIdNullConstraint` attempts to validate a NOT NULL constraint on `user_agent_details.organization_id`, but the column doesn't exist yet because a prior migration that should have added it didn't run cleanly.
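
For anyone trying to confirm whether they are in the same state, the checks below are a minimal sketch of what can be run in psql. They assume the table lives in the `public` schema and that `schema_migrations.version` is stored as a string, as in a standard Rails setup; neither is confirmed by this report.

```sql
-- Does the column exist? (zero rows means it is missing)
SELECT column_name, data_type, is_nullable
FROM information_schema.columns
WHERE table_schema = 'public'          -- assumption: default schema
  AND table_name = 'user_agent_details'
  AND column_name = 'organization_id';

-- Does the constraint exist, and has it been validated?
SELECT oid, conname, convalidated
FROM pg_constraint
WHERE conrelid = 'user_agent_details'::regclass
  AND conname = 'check_17a3a18e31';

-- Is the validation migration recorded as applied?
SELECT version
FROM schema_migrations
WHERE version = '20260126183256';
```

The `oid` column in the second query is the value worth comparing against whatever Rails reports for the same constraint.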
### Investigation

- The column `organization_id` was absent from `user_agent_details` at migration time
- The constraint `check_17a3a18e31` didn't exist in `pg_constraint` either
- Manually adding the column and populating it with a default value (1) worked at the psql level
- Manually creating the constraint as NOT VALID also worked at the psql level
- Running `ALTER TABLE user_agent_details VALIDATE CONSTRAINT check_17a3a18e31` directly in psql succeeded
- However, Rails was seeing a different OID for the constraint than psql, suggesting a stale connection or internal cache issue
- Inserting the migration version into `schema_migrations` manually had no effect; the migration kept re-running
- The root cause appears to be that the prior migration responsible for adding `organization_id` didn't run cleanly, leaving the schema in a state the validation migration cannot recover from on its own

### Workaround

First, manually fix the schema in psql:

```sql
ALTER TABLE user_agent_details ADD COLUMN organization_id bigint;
UPDATE user_agent_details SET organization_id = 1;
ALTER TABLE user_agent_details ADD CONSTRAINT check_17a3a18e31 CHECK (organization_id IS NOT NULL) NOT VALID;
ALTER TABLE user_agent_details VALIDATE CONSTRAINT check_17a3a18e31;
```

Then patch the migration file to a no-op and re-run `db:migrate`:

```bash
cat > /opt/gitlab/embedded/service/gitlab-rails/db/post_migrate/20260126183256_validate_user_agent_details_organization_id_null_constraint.rb << 'EOF'
# frozen_string_literal: true

class ValidateUserAgentDetailsOrganizationIdNullConstraint < Gitlab::Database::Migration[2.3]
  milestone '18.9'

  def up
    # patched: constraint already validated manually
  end

  def down
    # no-op
  end
end
EOF

gitlab-ctl stop puma
gitlab-ctl stop sidekiq
gitlab-rake db:migrate
gitlab-ctl reconfigure
gitlab-ctl restart
```

### Expected behavior

The migration should either handle the missing column gracefully, or the prior migration adding `organization_id` should guarantee the column exists before the validation migration runs.
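
To illustrate what "handling the missing column gracefully" could look like at the SQL level, here is a rough sketch of the same repair written so that it is safe to run more than once. It is not how the GitLab migration is implemented. The backfill value `1` is simply the value used in the workaround and assumes an organization with that id is the right owner for existing rows.

```sql
-- Add the column only if it is missing.
ALTER TABLE user_agent_details
  ADD COLUMN IF NOT EXISTS organization_id bigint;

-- Backfill only the rows that still need a value.
-- Assumption: organization 1 is the correct default on this instance.
UPDATE user_agent_details
SET organization_id = 1
WHERE organization_id IS NULL;

-- Create the constraint only if it does not already exist.
DO $$
BEGIN
  IF NOT EXISTS (
    SELECT 1
    FROM pg_constraint
    WHERE conrelid = 'user_agent_details'::regclass
      AND conname = 'check_17a3a18e31'
  ) THEN
    ALTER TABLE user_agent_details
      ADD CONSTRAINT check_17a3a18e31
      CHECK (organization_id IS NOT NULL) NOT VALID;
  END IF;
END
$$;

-- Validating an already-validated constraint is harmless.
ALTER TABLE user_agent_details VALIDATE CONSTRAINT check_17a3a18e31;
```

On a large table the `UPDATE` would normally be batched; it is left as a single statement here to mirror the workaround.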