Cells 1.0: ci_runners sharding key

For Cells 1.0 we need a way to make sure that no entity in one Organization has links to any other Organization.

To do that, we need to add sharding keys. See https://docs.gitlab.com/ee/architecture/blueprints/organization/isolation.html for details.

The ci_runners table poses a challenge here:

  1. It doesn't have a sharding key yet.
  2. A runner can belong to a project or a group.
  3. The sharding key would be nullable, because a runner can be an instance runner and not belong to anything.
  4. We allow users to link project runners to any other project with no restrictions.
  5. Tables that depend on the runner, e.g. runner_machines, will inherit the same limitations, including the sharding key being nullable.
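
To make points 2 and 3 concrete, here is a minimal Ruby sketch of the rule a nullable sharding key implies. The Runner struct and valid_sharding_key? helper are hypothetical names used only for illustration, not GitLab's actual model or validation code. The sharding_key_id column name is taken from the proposed solution below.

```ruby
# Hypothetical stand-in for a ci_runners row; not GitLab's actual model.
Runner = Struct.new(:id, :runner_type, :sharding_key_id, keyword_init: true)

# Returns true when the sharding key matches what the runner type allows.
def valid_sharding_key?(runner)
  case runner.runner_type
  when :instance_type
    # Instance runners belong to nothing, so the key must stay NULL.
    runner.sharding_key_id.nil?
  when :group_type, :project_type
    # Group and project runners must reference a namespace or a project.
    !runner.sharding_key_id.nil?
  else
    # Unknown runner types are rejected.
    false
  end
end
```

Under these assumptions, any constraint on the column has to be conditional on the runner type, which is why a plain NOT NULL constraint on a single ci_runners table does not fit.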

Proposed solution

  1. !166308 (merged): Add partition prefix to runner tokens.
  2. #493256 (closed): Add sharding_key_id column to ci_runners table
    1. %17.5: Backfill sharding_key_id
      • Update the app to populate sharding_key_id for new runners
    2. %17.6: Post-backfill cleanup (Release N+1)
    3. Optional %17.6: if we take the approach of only copying valid records to the partitioned table.
      1. Delete invalid records.
      2. Delete project and group runners belonging to n... (#498019 - closed)
      3. Validate sharding key constraints.
  3. #500447 (closed): Create routing table and partitions
    1. Create ci_runners_e59bb2812d routing table (exempt from sharding), and its 3 partitions:
      • instance_type_ci_runners_e59bb2812d will be attached to partition 1, and will be exempt from sharding.
      • group_type_ci_runners_e59bb2812d will be attached to partition 2, and will be sharded by namespace: namespace_id.
      • project_type_ci_runners_e59bb2812d will be attached to partition 3, and will be sharded by project: project_id.
    2. Mimic LFK definitions from ci_runners in config/gitlab_loose_foreign_keys.yml for the new ci_runners_e59bb2812d.
    3. Add missing FK definition between ci_runners_e59bb2812d and ci_runner_projects (resolves #369084 (closed) for the partitioned table)
    4. Create triggers to keep ci_runners and each of the partitioned tables in sync.
  4. See if we want to enforce unique encrypted tokens across partitions with a trigger: http://blog.ioguix.net/postgresql/2015/02/05/Partitionning-and-constraints-part-1.html.
  5. Copy data to partitioned table (enqueue_partitioning_data_migration).
  6. Add foreign key between ci_runner_projects and ... (#369084 - closed)
  7. Ensure sharding_key_id values are correctly maintained:
  8. Adapt FKs:
    • ci_running_builds.runner_id will need to point to ci_runners_e59bb2812d
    • Instance runners:
      • Make ci_cost_settings's fk_rails_6a70651f75 reference instance_type_ci_runners_e59bb2812d table, since this table only applies to instance runners.
    • Group runners:
      • Add an LFK pointing to namespaces.
      • Make ci_runner_namespaces's fk_rails_8767676b7a reference group_type_ci_runners_e59bb2812d table, since this table only applies to group runners.
    • Project runners:
      • Add an LFK pointing to projects.
      • ci_runner_projects doesn't have a FK pointing to ci_runners. We can add a new FK pointing to project_type_ci_runners_e59bb2812d, after ensuring that no records exist that point to missing project runners.
      • #504963 (closed): Call replace_with_partitioned_table and set Ci::Runner.primary_key = [:id, :runner_type].
  9. Replace ci_runners with partitioned table (#504963 - closed)
  10. Replace ci_runner_machines with partitioned table (#504965 - closed)
  11. Optimize code that can take advantage of runner_type PK, such as Ci::Runner.find when we know the runner type in addition to the ID.
  12. #516060 (closed): Once everything is verified to be running as expected, drop ci_runners_archived table.
  13. !171172 (merged): Change Ci::Runner#owner_project to return project_id, instead of relying on a join with ci_runner_projects.
  14. Add organization_id column to runner tables (#523694 - closed)
  15. Add logic to validate organization_id field in ... (#548139 - closed)
  16. Finalize organization_id backfill migrations (#523850 - closed)
  17. Add check_organization_id_nullness constraint c... (#523851 - closed)
  18. Drop check constraints on sharding_key_id colum... (#557226 - closed)
  19. Validate check_organization_id_nullness constra... (#523852)
  20. Column organization_id on runner tables should ... (#525293 - closed)
  21. Ignore sharding_key_id column in runner models (#547654 - closed)
  22. Drop sharding_key_id column from runner tables (#547650 - closed)
  23. Bonus: We could drop ci_runner_namespaces if we don't plan on allowing sharing group runners across multiple groups.
  24. Mark db/docs/ci_runners.yml as exempt_from_sharding: true. This will affect Org Mover, which is planned to skip exempt_from_sharding tables.
  25. Open a follow-up issue to address impact on Org Mover
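
The routing idea behind steps 3 and 11 can be sketched in plain Ruby. This is an in-memory illustration only: the RoutingTable class and its methods are hypothetical, and the mapping of runner_type values 1, 2 and 3 to partitions is assumed from the partition numbers listed in step 3. It shows why a lookup that knows the runner type in addition to the ID touches a single partition, while a lookup by ID alone has to consider all of them.

```ruby
# Partition per runner type, following the names listed in step 3.
# The integer keys are assumed to match the partition numbers given there.
PARTITIONS = {
  1 => 'instance_type_ci_runners_e59bb2812d',
  2 => 'group_type_ci_runners_e59bb2812d',
  3 => 'project_type_ci_runners_e59bb2812d'
}.freeze

# Hypothetical in-memory model of the routing table.
# Rows are stored per partition and keyed by id.
class RoutingTable
  def initialize
    @rows = PARTITIONS.keys.to_h { |type| [type, {}] }
  end

  # Routes the row to the partition that matches its runner_type.
  def insert(id:, runner_type:, sharding_key_id: nil)
    partition = @rows.fetch(runner_type) do
      raise ArgumentError, "no partition for runner_type #{runner_type}"
    end
    partition[id] = { id: id, runner_type: runner_type, sharding_key_id: sharding_key_id }
  end

  # Lookup by the composite key [id, runner_type]: reads one partition only.
  def find(id, runner_type)
    @rows.fetch(runner_type)[id]
  end

  # Lookup by id alone: has to scan every partition.
  def find_by_id(id)
    @rows.each_value do |partition|
      return partition[id] if partition.key?(id)
    end
    nil
  end
end
```

For example, after table.insert(id: 7, runner_type: 3, sharding_key_id: 42), calling table.find(7, 3) reads only the project partition, whereas table.find_by_id(7) walks the instance and group partitions first.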