Add NOT NULL for sharding key on ml_experiment_metadata
What does this MR do and why?
Add NOT NULL constraint to ml_experiment_metadata.project_id column.
- We do not add a
presence: true
validation in the model because we're populating the column using a "BEFORE INSERT OR UPDATE" trigger. When an object is created in rails, the attribute might be nil but it will be filled in the database transaction before being committed. - We have finalized the backfill migration for this column.
The sharding key will be used to determine what Organization the row belongs to.
References
- Organization Isolation and Sharding Keys: https://handbook.gitlab.com/handbook/engineering/architecture/design-documents/organization/isolation/
- Sharding Key (Cells) Development Guide: https://docs.gitlab.com/ee/development/cells/#defining-a-sharding-key-for-all-cell-local-tables
- Related to #493768
MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Screenshots or screen recordings
N/A
How to set up and validate locally
N/A
Merge request reports
Activity
changed milestone to %17.8
assigned to @smaglangit
added pipelinetier-1 label
added database databasereview pending labels
1 Warning You've made some app changes, but didn't add any tests.
That's OK as long as you're refactoring existing code,
but please consider adding any of the maintenancepipelines, maintenancerefactor, maintenanceworkflow, maintenanceperformance, documentation, QA labels.Reviewer roulette
Category Reviewer Maintainer database @acroitor
(UTC+2, 6 hours behind author)
@pshutsin
(UTC+1, 7 hours behind author)
Please refer to documentation page for guidance on how you can benefit from the Reviewer Roulette, or use the GitLab Review Workload Dashboard to find other available reviewers.
If needed, you can retry the
danger-review
job that generated this comment.Generated by
DangerEdited by ****added Data WarehouseImpact Check label
Started database testing pipeline (limited access). This comment will be updated once the pipeline has finished running.
Database migrations (on the main database)
Migrations included in this change have been executed on gitlab.com data for testing purposes. For details, please see the migration testing pipeline (limited access).
Migration Type Total runtime Result DB size change 20250108115531 - AddMlExperimentMetadataProjectIdNotNullConstraint Post deploy 6.4 s +0.00 B Runtime Histogram for all migrations
Query Runtime Count 0 seconds - 0.01 seconds 0 0.01 seconds - 0.1 seconds 0 0.1 seconds - 1 second 5 1 second - 5 seconds 0 5 seconds - 15 seconds 0 15 seconds - 5 minutes 0 5 minutes + 0 Migration: 20250108115531 - AddMlExperimentMetadataProjectIdNotNullConstraint
* Duration: 6.4 s
* Database size change: +0.00 BCalls Total Time Max Time Mean Time Rows Query 1 0.9 ms 0.9 ms 0.9 ms 0 ALTER TABLE ml_experiment_metadata ADD CONSTRAINT check_ca9b8315ef CHECK ( project_id IS NOT NULL ) NOT VALID
1 0.7 ms 0.7 ms 0.7 ms 1 SELECT c.is_nullable
FROM information_schema.columns c
WHERE c.table_schema = $1 AND c.table_name = $2 AND c.column_name = $31 0.5 ms 0.5 ms 0.5 ms 0 ALTER TABLE ml_experiment_metadata VALIDATE CONSTRAINT check_ca9b8315ef
2 0.0 ms 0.0 ms 0.0 ms 2 SELECT pg_backend_pid()
Histogram for AddMlExperimentMetadataProjectIdNotNullConstraint
Query Runtime Count 0 seconds - 0.01 seconds 0 0.01 seconds - 0.1 seconds 0 0.1 seconds - 1 second 5 1 second - 5 seconds 0 5 seconds - 15 seconds 0 15 seconds - 5 minutes 0 5 minutes + 0 Background Migration: SkipCompanyOnboardingStep
Sampled 108 batches. Estimated Time to complete: 1 week, 1 day, 12 hours, and 36 minutes
- Interval: 120s
- Max batch size: 10000
- Estimated seconds to complete: 736560s
- Average batch time: 6.84s
- Batch size: 3000
- N. of batches sampled: 108
- N. of failed batches: 0
Time estimation is conservative and based on sampling production data in a test environment. It represents the max time that migration could take. The actual time may differ from this estimation.
Consider changing max_batch_size and interval if this estimate is unacceptable.
Calls Total Time Max Time Mean Time Rows Query 1177 115452.1 ms 235.7 ms 98.1 ms 755 UPDATE user_details
SET onboarding_status = onboarding_status || jsonb_build_object($1, $2)
WHERE user_details.user_id IN (
SELECT users.id
FROM users
WHERE users.id BETWEEN $3 AND $4 AND users.id >= $5 AND users.id < $6 AND users.onboarding_in_progress = $7
) AND NOT jsonb_exists(user_details.onboarding_status, $8) AND jsonb_path_exists(user_details.onboarding_status, $9)108 10932.9 ms 202.5 ms 101.2 ms 50 UPDATE user_details
SET onboarding_status = onboarding_status || jsonb_build_object($1, $2)
WHERE user_details.user_id IN (
SELECT users.id
FROM users
WHERE users.id BETWEEN $3 AND $4 AND users.id >= $5 AND users.onboarding_in_progress = $6
) AND NOT jsonb_exists(user_details.onboarding_status, $7) AND jsonb_path_exists(user_details.onboarding_status, $8)1285 223.6 ms 13.4 ms 0.2 ms 1177 SELECT users.id
FROM users
WHERE users.id BETWEEN $1 AND $2 AND users.id >= $3
ORDER BY users.id ASC
LIMIT $4
OFFSET $5216 32.5 ms 1.2 ms 0.2 ms 216 INSERT INTO batched_background_migration_job_transition_logs (batched_background_migration_job_id, created_at, updated_at, previous_status, next_status) VALUES ($1, $2, $3, $4, $5) RETURNING id
108 16.1 ms 0.9 ms 0.1 ms 108 UPDATE batched_background_migration_jobs
SET updated_at = $1, finished_at = $2, status = $3, metrics = $4
WHERE batched_background_migration_jobs.id = $5108 13.8 ms 0.8 ms 0.1 ms 108 UPDATE batched_background_migration_jobs
SET updated_at = $1, started_at = $2, status = $3, attempts = $4
WHERE batched_background_migration_jobs.id = $5108 7.8 ms 0.2 ms 0.1 ms 108 SELECT sum(batched_background_migration_jobs.batch_size)
FROM batched_background_migration_jobs
WHERE batched_background_migration_jobs.batched_background_migration_id = $1 AND batched_background_migration_jobs.status IN ($2)216 8.9 ms 0.1 ms 0.0 ms 216 SELECT batched_background_migration_jobs.*
FROM batched_background_migration_jobs
WHERE batched_background_migration_jobs.id = $1
LIMIT $2108 3.9 ms 0.1 ms 0.0 ms 108 SELECT users.id
FROM users
WHERE users.id BETWEEN $1 AND $2
ORDER BY users.id ASC
LIMIT $3Histogram of batch runtimes for SkipCompanyOnboardingStep
Batch Runtime Count 0 seconds - 10 seconds 108 10 seconds - 1 minute 0 1 minute - 2 minutes 0 2 minutes - 3 minutes 0 3 minutes - 5 minutes 0 5 minutes + 0 Histogram across all sampled batches of SkipCompanyOnboardingStep
Query Runtime Count 0 seconds - 0.1 seconds 0 0.1 seconds - 0.5 seconds 3433 0.5 seconds - 1 second 1 1 second - 2 seconds 0 2 seconds - 5 seconds 0 5 seconds + 0 Other migrations pending on GitLab.com
Migration Type Total runtime Result DB size change 20241219144523 - QueueSkipCompanyOnboardingStep Post deploy 7.4 s +0.00 B 20241226131806 - FinalizeMigrateOsSbomOccurrencesToComponentsWithoutPrefix Post deploy 5.7 s +0.00 B 20241231055516 - CleanupProjectSettingPagesDefaultDomainRedirectRename Post deploy 5.9 s +0.00 B Clone details
Clone ID Clone Created At Clone Data Timestamp Expected Removal Time database-testing-4072061-16738226-main
2025-01-10T04:00:55Z 2025-01-08T23:12:29Z 2025-01-10 16:30:35 +0000 database-testing-4072061-16738226-ci
2025-01-10T04:00:57Z 2025-01-09T22:49:28Z 2025-01-10 16:30:35 +0000 Database migrations (on the ci database)
Migrations included in this change have been executed on gitlab.com data for testing purposes. For details, please see the migration testing pipeline (limited access).
Migration Type Total runtime Result DB size change 20250108115531 - AddMlExperimentMetadataProjectIdNotNullConstraint Post deploy 9.3 s +0.00 B Runtime Histogram for all migrations
Query Runtime Count 0 seconds - 0.01 seconds 0 0.01 seconds - 0.1 seconds 0 0.1 seconds - 1 second 5 1 second - 5 seconds 0 5 seconds - 15 seconds 0 15 seconds - 5 minutes 0 5 minutes + 0 Migration: 20250108115531 - AddMlExperimentMetadataProjectIdNotNullConstraint
* Duration: 9.3 s
* Database size change: +0.00 BCalls Total Time Max Time Mean Time Rows Query 1 7.4 ms 7.4 ms 7.4 ms 1 SELECT c.is_nullable
FROM information_schema.columns c
WHERE c.table_schema = $1 AND c.table_name = $2 AND c.column_name = $31 1.6 ms 1.6 ms 1.6 ms 0 ALTER TABLE ml_experiment_metadata ADD CONSTRAINT check_ca9b8315ef CHECK ( project_id IS NOT NULL ) NOT VALID
1 0.2 ms 0.2 ms 0.2 ms 0 ALTER TABLE ml_experiment_metadata VALIDATE CONSTRAINT check_ca9b8315ef
2 0.0 ms 0.0 ms 0.0 ms 2 SELECT pg_backend_pid()
Histogram for AddMlExperimentMetadataProjectIdNotNullConstraint
Query Runtime Count 0 seconds - 0.01 seconds 0 0.01 seconds - 0.1 seconds 0 0.1 seconds - 1 second 5 1 second - 5 seconds 0 5 seconds - 15 seconds 0 15 seconds - 5 minutes 0 5 minutes + 0 Other information
No other migrations pending on GitLab.com
Clone details
Clone ID Clone Created At Clone Data Timestamp Expected Removal Time database-testing-4072061-16738226-main
2025-01-10T04:00:55Z 2025-01-08T23:12:29Z 2025-01-10 16:30:35 +0000 database-testing-4072061-16738226-ci
2025-01-10T04:00:57Z 2025-01-09T22:49:28Z 2025-01-10 16:30:35 +0000
Brought to you by gitlab-org/database-team/gitlab-com-database-testing. Epic
Edited by ****mentioned in issue #493768
changed milestone to %17.9
added database-testing-automation label
Hi @shubhamkrai, could you please do the initial review for this? Thank you
requested review from @shubhamkrai
added pipeline:mr-approved label
added pipelinetier-2 label and removed pipelinetier-1 label
Before you set this MR to auto-merge
This merge request will progress on pipeline tiers until it reaches the last tier: pipelinetier-3. We will trigger a new pipeline for each transition to a higher tier.
Before you set this MR to auto-merge, please check the following:
- You are the last maintainer of this merge request
- The latest pipeline for this merge request is pipelinetier-3 (You can find which tier it is in the pipeline name)
- This pipeline is recent enough (created in the last 8 hours)
If all the criteria above apply, please set auto-merge for this merge request.
See pipeline tiers and merging a merge request for more details.
- Resolved by Adam Hegyi
Hello @ahegyi
Could you do the final review for this?
Generated bygitlab_quality-test_tooling
.
Slow tests detected in this merge request. These slow tests might be related to this merge request's changes.Click to expand
Job File Name Duration Expected duration #8846930374 spec/lib/gitlab/database/decomposition/migrate_spec.rb#L103
Gitlab::Database::Decomposition::Migrate#process! when the checks pass copies main database to ci database 27.24 s < 27.12 s #9005814344 spec/lib/gitlab/database/decomposition/migrate_spec.rb#L184
Gitlab::Database::Decomposition::Migrate#process! when all background migrations are finished does not raise an error 29.62 s < 27.12 s Edited by ****added rspec:slow test detected label
E2E Test Result Summary
allure-report-publisher
generated test report!e2e-test-on-gdk:
test report for d0200b5bexpand test summary
+------------------------------------------------------------------+ | suites summary | +-------------+--------+--------+---------+-------+-------+--------+ | | passed | failed | skipped | flaky | total | result | +-------------+--------+--------+---------+-------+-------+--------+ | Create | 138 | 0 | 20 | 0 | 158 | ✅ | | Secure | 4 | 0 | 3 | 0 | 7 | ✅ | | Plan | 82 | 0 | 8 | 0 | 90 | ✅ | | Data Stores | 33 | 0 | 10 | 0 | 43 | ✅ | | Govern | 79 | 0 | 13 | 0 | 92 | ✅ | | Verify | 52 | 0 | 20 | 1 | 72 | ✅ | | Package | 24 | 0 | 14 | 0 | 38 | ✅ | | Fulfillment | 2 | 0 | 7 | 0 | 9 | ✅ | | Monitor | 8 | 0 | 12 | 0 | 20 | ✅ | | Configure | 0 | 0 | 3 | 0 | 3 | ➖ | | Manage | 1 | 0 | 9 | 0 | 10 | ✅ | | Ai-powered | 0 | 0 | 2 | 0 | 2 | ➖ | | Growth | 0 | 0 | 2 | 0 | 2 | ➖ | | ModelOps | 0 | 0 | 1 | 0 | 1 | ➖ | | Release | 5 | 0 | 1 | 0 | 6 | ✅ | | Analytics | 2 | 0 | 0 | 0 | 2 | ✅ | +-------------+--------+--------+---------+-------+-------+--------+ | Total | 430 | 0 | 125 | 1 | 555 | ✅ | +-------------+--------+--------+---------+-------+-------+--------+
e2e-test-on-cng:
test report for d0200b5bexpand test summary
+------------------------------------------------------------------+ | suites summary | +-------------+--------+--------+---------+-------+-------+--------+ | | passed | failed | skipped | flaky | total | result | +-------------+--------+--------+---------+-------+-------+--------+ | Plan | 86 | 0 | 8 | 0 | 94 | ✅ | | Verify | 53 | 0 | 19 | 0 | 72 | ✅ | | Monitor | 8 | 0 | 12 | 0 | 20 | ✅ | | Create | 143 | 0 | 19 | 0 | 162 | ✅ | | Govern | 84 | 0 | 10 | 0 | 94 | ✅ | | Package | 29 | 0 | 15 | 0 | 44 | ✅ | | Secure | 2 | 0 | 5 | 0 | 7 | ✅ | | Data Stores | 33 | 0 | 10 | 0 | 43 | ✅ | | Fulfillment | 2 | 0 | 7 | 0 | 9 | ✅ | | Manage | 1 | 0 | 9 | 0 | 10 | ✅ | | Configure | 0 | 0 | 3 | 0 | 3 | ➖ | | Release | 5 | 0 | 1 | 0 | 6 | ✅ | | ModelOps | 0 | 0 | 1 | 0 | 1 | ➖ | | Ai-powered | 0 | 0 | 2 | 0 | 2 | ➖ | | Growth | 0 | 0 | 2 | 0 | 2 | ➖ | | Analytics | 2 | 0 | 0 | 0 | 2 | ✅ | +-------------+--------+--------+---------+-------+-------+--------+ | Total | 448 | 0 | 123 | 0 | 571 | ✅ | +-------------+--------+--------+---------+-------+-------+--------+
Edited by ****added pipelinetier-3 pipeline:run-e2e-omnibus-once labels and removed pipelinetier-2 label
added databaseapproved pipelinetier-2 labels and removed databasereview pending pipelinetier-3 pipeline:run-e2e-omnibus-once labels
requested review from @ahegyi
added Data WarehouseNot Impacted label and removed Data WarehouseImpact Check label
started a merge train
removed this merge request from the merge train because the pipeline did not succeed. Learn more.
started a merge train
removed this merge request from the merge train because the pipeline did not succeed. Learn more.
removed workflowready for development label
added pipelinetier-3 label and removed pipelinetier-2 label
Hi @ahegyi, would you please look into merging this again? The previous one failed because of the wrong pipeline tier. Thank you!
added this merge request to the merge train at position 3
mentioned in commit 513c086e
Hello @smaglangit
The database team is looking for ways to improve the database review process and we would love your help!
If you'd be open to someone on the database team reaching out to you for a chat, or if you'd like to leave some feedback asynchronously, just post a reply to this comment mentioning:
@gitlab-org/database-team
And someone will be by shortly!
Thanks for your help!
This message was generated automatically. Improve it or delete it.
added workflowstaging-canary label
added workflowcanary label and removed workflowstaging-canary label
added workflowstaging label and removed workflowcanary label
added workflowproduction label and removed workflowstaging label
added workflowpost-deploy-db-staging label and removed workflowproduction label