Skip to content

Add NOT NULL constraint to gitlab_subscriptions namespace_id

What does this MR do?

Summary

This work comes from issue #243496 (closed).It also related to #214434 (closed).

There are several changes in this MR:

  1. Add not null constraint to gitlab_subscriptions namespace_id
  2. Remove always-true code logic hosted? from GitlabSubscription model
  3. Remove not-possible context 'when namespace is absent' from rspec ee/spec/workers/update_max_seats_used_for_gitlab_com_subscriptions_worker_spec.rb
  4. Update several rspec file, to make sure create(:gitlab_subscription) will always have a namespace associated. So it won't break the not null constraint of gitlab_subscription's namespace_id.

migration output

$ bundle exec rails db:migrate
== 20210303064112 AddNotNullConstraintsToGitlabSubscriptionsNamespaceId: migrating 
-- current_schema()
   -> 0.0003s
-- transaction_open?()
   -> 0.0000s
-- current_schema()
   -> 0.0003s
-- execute("ALTER TABLE gitlab_subscriptions\nADD CONSTRAINT check_77fea3f0e7\nCHECK ( namespace_id IS NOT NULL )\nNOT VALID;\n")
   -> 0.0012s
== 20210303064112 AddNotNullConstraintsToGitlabSubscriptionsNamespaceId: migrated (0.0229s) 

== 20210303064142 CleanupGitlabSubscriptionsWithNullNamespaceId: migrating ====
== 20210303064142 CleanupGitlabSubscriptionsWithNullNamespaceId: migrated (0.0149s) 
$ bundle exec rails db:rollback
== 20210303064142 CleanupGitlabSubscriptionsWithNullNamespaceId: reverting ====
== 20210303064142 CleanupGitlabSubscriptionsWithNullNamespaceId: reverted (0.0000s) 

$ bundle exec rails db:rollback
== 20210303064112 AddNotNullConstraintsToGitlabSubscriptionsNamespaceId: reverting 
-- execute("ALTER TABLE gitlab_subscriptions\nDROP CONSTRAINT IF EXISTS check_77fea3f0e7\n")
   -> 0.0009s
== 20210303064112 AddNotNullConstraintsToGitlabSubscriptionsNamespaceId: reverted (0.0141s) 

The RAW SQL and Query plan

There is a new query GitlabSubscription.where('namespace_id IS NULL').delete_all in post deployment migration script db/post_migrate/20210303064142_cleanup_gitlab_subscriptions_with_null_namespace_id.rb.

The raw sql query is:

DELETE FROM "gitlab_subscriptions" WHERE (namespace_id IS NULL) 

Note: as in the comment https://gitlab.com/gitlab-org/gitlab/-/blob/b99d41b32144117f919c3473c7e91fa17cb3b806/db/post_migrate/20210303064142_cleanup_gitlab_subscriptions_with_null_namespace_id.rb#L13-19, we expect 0 rows whose namespace_id is null on GitLab.com.

The query plan link https://console.postgres.ai/shared/9ab62af4-8ba4-44f2-9407-213e8f72d57c.

Plan with execution:

 ModifyTable on public.gitlab_subscriptions  (cost=0.43..2.85 rows=1 width=6) (actual time=4.613..4.614 rows=0 loops=1)
   Buffers: shared read=4
   I/O Timings: read=4.570
   ->  Index Scan using index_gitlab_subscriptions_on_namespace_id on public.gitlab_subscriptions  (cost=0.43..2.85 rows=1 width=6) (actual time=4.611..4.612 rows=0 loops=1)
         Index Cond: (gitlab_subscriptions.namespace_id IS NULL)
         Buffers: shared read=4
         I/O Timings: read=4.570

Plan summary:

Time: 5.660 ms
  - planning: 0.972 ms
  - execution: 4.688 ms
    - I/O read: 4.570 ms
    - I/O write: N/A

Shared buffers:
  - hits: 0 from the buffer pool
  - reads: 4 (~32.00 KiB) from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0

Details of the changes

The goal of issue #243496 (closed), is to reduce the data integrity risk. We can achieve this goal by:

  • create unique index for namespace_id in gitlab_subscriptions table
  • avoid orphaned gitlab_subscriptions where it's namespace_id is NULL

As confirmed in comment #243496 (comment 509258896), gitlab_subscription table already has unique index on namespace_id. In db/structure.sql:

CREATE UNIQUE INDEX index_gitlab_subscriptions_on_namespace_id ON gitlab_subscriptions USING btree (namespace_id);

This MR is to add NOT NULL for namespace_id column in gitlab_subscriptions table.

This MR follows the suggested approach in document https://docs.gitlab.com/ee/development/database/not_null_constraints.html#add-a-not-null-constraint-to-an-existing-column

  • Ensure the constraint is enforced at the application level (i.e. add a model validation).
  • Add a post-deployment migration to add the NOT NULL constraint with validate: false.
  • Add a post-deployment migration to fix the existing records -- NOTE: searched from database lab, today there is 0 orphaned gitlab_subscriptions(namespace_id is nil) in the GitLab.com production server. But I think it is no harm to have this cleanup post-migration script to ensure.

Besides the database schema change, this MR also removed some useless code logic:

  • it removes the hosted? check in GitlabSubscription model.
    • GitlabSubscription is only used for Gitlab.com, there is no need to check hosted?. As it will always be hosted
    • The way it checked hosted? was namespace_id.present?. This also does not make sense. As we manipulate GitlabSubscription from Namespace(through namespace.gitlab_subscription) --- if a gitlab_subscription has NULL namespace_id, we would never reach that gitlab_subscription. So, as long as we get a namespace.gitlab_subscripton, the statement namespace_id.present? will always return true.
    • the rspec of GitlabSubscription has some test cases try to create a gitlab_subscription with NULL namespace_id. This is not possible with the not null constraint added by this MR.
  • it removes the context 'when namespace is absent' from rspec ee/spec/workers/update_max_seats_used_for_gitlab_com_subscriptions_worker_spec.rb. As it is not possible to create a gitlab_subscription with NULL namespace_id now.
    • I wonder we may remove the corresponding code logic unless subscription.namespace from ee/app/workers/update_max_seats_used_for_gitlab_com_subscriptions_worker.rb. But for now it is no harm to leave it. We can remove this in next milestone -- after this MR applied, there will never be any chance to have a gitlab_subscription with NULL namespace_id.

Screenshots (strongly suggested)

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
Edited by Mayra Cabrera

Merge request reports