Ci::DeleteObjectsWorker fails with PG::CheckViolation: ERROR: new row for relation "ci_deleted_objects" violates check constraint "check_98f90d6c53"

Summary

The GitLab 17.4.0 update introduced several database migrations, including one that adds the check constraint "check_98f90d6c53": execute("ALTER TABLE ci_deleted_objects\nADD CONSTRAINT check_98f90d6c53\nCHECK ( project_id IS NOT NULL )\nNOT VALID;\n")

This check was added with the "NOT VALID" clause, so it is not applied to rows that already exist. Only newly written rows are checked, and that includes rows rewritten by an UPDATE.

The Sidekiq worker Ci::DeleteObjectsWorker, however, runs an UPDATE on this table as part of its job. The SQL query is UPDATE "ci_deleted_objects" SET "pick_up_at" = $1 WHERE... It changes only pick_up_at, but it still fails on existing rows where project_id is NULL. This floods Sentry with errors, because the worker runs every 16 minutes. It also increases storage use, because the workers that would normally pick up artifacts for deletion now crash every time.
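
The behaviour described above can be reproduced in isolation. The following is a minimal sketch for a scratch PostgreSQL session; the table demo_deleted_objects and the constraint demo_project_id_not_null are hypothetical names made up for illustration and are not part of the GitLab schema.

```sql
-- Hypothetical scratch table, NOT the real GitLab schema.
CREATE TEMP TABLE demo_deleted_objects (
    id         bigint PRIMARY KEY,
    project_id bigint,
    pick_up_at timestamptz NOT NULL DEFAULT now()
);

-- A legacy row written before the constraint existed.
INSERT INTO demo_deleted_objects (id, project_id) VALUES (1, NULL);

-- NOT VALID skips the scan of existing rows, so adding the
-- constraint succeeds even though row 1 violates it.
ALTER TABLE demo_deleted_objects
    ADD CONSTRAINT demo_project_id_not_null
    CHECK (project_id IS NOT NULL)
    NOT VALID;

-- This statement is expected to fail with a check violation:
-- project_id is not touched, but the new row version is checked
-- as a whole and still carries project_id = NULL.
UPDATE demo_deleted_objects
SET pick_up_at = now() + interval '1 hour'
WHERE id = 1;
```

The final UPDATE mirrors the worker's query in shape only: it changes a single unrelated column and is still rejected because of the NULL project_id on the existing row.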

Relevant logs and/or screenshots

https://sentry.ict.fit.cvut.cz/share/issue/bf332fbb596044f481b1f0a78e645b0d/

Output of checks

Results of GitLab environment info

System information
System:		Ubuntu 22.04
Proxy:		no
Current User:	git
Using RVM:	no
Ruby Version:	3.1.5p253
Gem Version:	3.5.17
Bundler Version:2.5.11
Rake Version:	13.0.6
Redis Version:	7.0.15
Sidekiq Version:7.2.4
Go Version:	unknown

GitLab information
Version:	17.4.1-ee
Revision:	40bdc966046
Directory:	/opt/gitlab/embedded/service/gitlab-rails
DB Adapter:	PostgreSQL
DB Version:	14.11
URL:		https://gitlab.fit.cvut.cz
HTTP Clone URL:	https://gitlab.fit.cvut.cz/some-group/some-project.git
SSH Clone URL:	git@gitlab.fit.cvut.cz:some-group/some-project.git
Elasticsearch:	no
Geo:		no
Using LDAP:	yes
Using Omniauth:	no

GitLab Shell
Version:	14.39.0
Repository storages:
- default: 	unix:/var/opt/gitlab/gitaly/gitaly.socket
GitLab Shell path:		/opt/gitlab/embedded/service/gitlab-shell

Gitaly
- default Address: 	unix:/var/opt/gitlab/gitaly/gitaly.socket
- default Version: 	17.4.1
- default Git Version: 	2.46.0

Update

It seems that the fix was applied in a migration that is executed before the "broken" migration, so the bug is still present on environments that updated directly to a version containing the fix without going through the broken version first.

Workaround

Execute gitlab-rake db:migrate:up:ci RAILS_ENV=production VERSION=20241028085044 to update the invalid records, and then re-run the migrations. (If this command fails with an error about an invalid rake task, remove the constraint instead, as indicated below.)
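
To see whether an instance is affected, or whether the workaround took effect, the offending rows can be counted directly. This is a small sketch that assumes a SQL console connected to the database holding ci_deleted_objects; it uses only the table and column named in this report.

```sql
-- Count rows that would make the worker's UPDATE fail.
-- A non-zero result suggests invalid records are still present.
SELECT COUNT(*) AS rows_without_project
FROM ci_deleted_objects
WHERE project_id IS NULL;
```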

Edited by Mario Mora