GitLab 18.9 upgrade failure: ValidateUserAgentDetailsOrganizationIdNullConstraint migration crashes on user_agent_details
### Environment
- GitLab EE, upgraded from 18.8.4 to 18.9.0
- PostgreSQL 16
- Omnibus package on Debian 13
### Symptoms
Running `gitlab-rake db:migrate` after upgrading to 18.9.0 fails with:
```text
PG::CheckViolation: ERROR: check constraint "check_17a3a18e31" of relation "user_agent_details" is violated by a row
```
The migration `20260126183256 ValidateUserAgentDetailsOrganizationIdNullConstraint` attempts to validate a NOT NULL check constraint on `user_agent_details.organization_id`. On this instance the column did not exist yet, apparently because an earlier migration that should have added it did not run cleanly.
### Investigation
- The column `organization_id` was absent from `user_agent_details` at migration time
- The constraint `check_17a3a18e31` didn't exist in `pg_constraint` either
- Manually adding the column and populating it with a default value of `1` worked at the psql level
- Manually creating the constraint as `NOT VALID` also worked at the psql level
- Running `ALTER TABLE user_agent_details VALIDATE CONSTRAINT check_17a3a18e31` directly in psql succeeded
- However, Rails was seeing a different OID for the constraint than psql, suggesting a stale connection or an internal cache issue
- Inserting the migration version into `schema_migrations` manually had no effect; the migration kept re-running
- The root cause appears to be that the prior migration responsible for adding `organization_id` didn't run cleanly, leaving the schema in a state the validation migration cannot recover from on its own
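
For anyone trying to confirm the same state, the checks above can be sketched as the following read-only queries. This is a sketch, not an official diagnostic. It assumes the table lives in the `public` schema and that you are connected to the GitLab main database (for example via `gitlab-psql` on an Omnibus install).

```sql
-- Does the column exist on the table? (assumes the public schema)
SELECT column_name, data_type, is_nullable
FROM information_schema.columns
WHERE table_schema = 'public'
  AND table_name = 'user_agent_details'
  AND column_name = 'organization_id';

-- Does the check constraint exist, and has it been validated?
-- convalidated = false means it was created NOT VALID and never validated.
SELECT c.oid, c.conname, c.convalidated,
       pg_get_constraintdef(c.oid) AS definition
FROM pg_constraint AS c
WHERE c.conrelid = 'user_agent_details'::regclass
  AND c.conname = 'check_17a3a18e31';

-- Is the validation migration recorded as applied?
SELECT version
FROM schema_migrations
WHERE version = '20260126183256';
```

An empty result from the first two queries matches the state described in this issue: neither the column nor the constraint was present.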
### Workaround
First, manually fix the schema in psql:
```sql
ALTER TABLE user_agent_details ADD COLUMN organization_id bigint;
UPDATE user_agent_details SET organization_id = 1;
ALTER TABLE user_agent_details ADD CONSTRAINT check_17a3a18e31 CHECK (organization_id IS NOT NULL) NOT VALID;
ALTER TABLE user_agent_details VALIDATE CONSTRAINT check_17a3a18e31;
```
Then patch the migration file to a no-op and re-run `db:migrate`:
```bash
cat > /opt/gitlab/embedded/service/gitlab-rails/db/post_migrate/20260126183256_validate_user_agent_details_organization_id_null_constraint.rb << 'EOF'
# frozen_string_literal: true

class ValidateUserAgentDetailsOrganizationIdNullConstraint < Gitlab::Database::Migration[2.3]
  milestone '18.9'

  def up
    # patched: constraint already validated manually
  end

  def down
    # no-op
  end
end
EOF
gitlab-ctl stop puma
gitlab-ctl stop sidekiq
gitlab-rake db:migrate
gitlab-ctl reconfigure
gitlab-ctl restart
```
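
The manual schema fix from the first step can also be written so that it is safe to re-run if it was already partially applied. The variant below is a minimal sketch under the same assumptions as the original statements. In particular, the organization ID `1` is simply the value used in this report and may not be correct for every instance.

```sql
BEGIN;

-- Add the column only if the earlier migration did not create it.
ALTER TABLE user_agent_details
  ADD COLUMN IF NOT EXISTS organization_id bigint;

-- Backfill only rows that are still missing a value.
-- Assumption: organization 1 is the right default for this instance.
UPDATE user_agent_details
SET organization_id = 1
WHERE organization_id IS NULL;

-- PostgreSQL has no ADD CONSTRAINT IF NOT EXISTS, so check pg_constraint first.
DO $$
BEGIN
  IF NOT EXISTS (
    SELECT 1
    FROM pg_constraint
    WHERE conrelid = 'user_agent_details'::regclass
      AND conname = 'check_17a3a18e31'
  ) THEN
    ALTER TABLE user_agent_details
      ADD CONSTRAINT check_17a3a18e31
      CHECK (organization_id IS NOT NULL) NOT VALID;
  END IF;
END
$$;

-- Validation scans the table and fails if any NULL rows remain.
ALTER TABLE user_agent_details VALIDATE CONSTRAINT check_17a3a18e31;

COMMIT;
```

Because everything runs in one transaction, a failure at the validation step rolls back the column and backfill as well, leaving the table as it was.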
### Expected behavior
The migration should either handle the missing column gracefully, or the prior migration adding `organization_id` should guarantee the column exists before the validation migration runs.