Skip to content

Explore how to partition new tables when dynamic partitions are automatically created

Everyone can contribute. Help move this issue forward while earning points, leveling up and collecting rewards.

Problem

In &11806 we have added an automated way of creating new partitions for the routing tables. This will insert new records in the ci_partitions table and bump the writes to the next available partition when the any of the current ones reach 100GB. This is a problem for partitioning new tables in the CI domain. The partitioning strategy is designed to have all resources from a pipeline object share the same partition_id value.(for example all all the jobs from a pipeline must have the same partition_id value as the pipeline)

To partition an existing table with zero downtime, we must add a check constraint on that table that enforces all the rows from it to have a limited set of partition_ids. For example, when we partitioned ci_builds we added this constraint: CONSTRAINT partitioning_constraint CHECK (partition_id = (100)::bigint) in order to attach the table to the routing table without holding the expensive locks for too long. But when we got to partition ci_job_artifacts we had to add this check constraint: CONSTRAINT partitioning_constraint CHECK ((partition_id = ANY (ARRAY[(100)::bigint, (101)::bigint]))) because at that time we were writing data to the second partition(ci_builds_101) for p_ci_builds and the ci_job_artifacts had records with both partition_id=100 and partition_id=101. For ci_pipelines we'll have to add a check constraint for partition_id in (100, 101, 102) because we have created the partitions for partition_id=102 for the current partitioned table and we plan to write to them.

But if we enable the partition manager, it will start creating new partitions as soon as a partition from the current partitioned tables reaches the size limit. Example:

  • the partition manager is enabled
  • the current partition_id=102
  • we add CONSTRAINT partitioning_constraint CHECK ((partition_id = ANY (100, 101, 102))) to ci_pipelines
  • ci_builds_metadata_102 reaches the threshold for switching to the next partition
  • ci_builds_103, ci_builds_metadata_103, ..., partitions are created
  • the partition manager switches the writes to partition_id=103
  • now inserts into ci_pipelines start to fail because the partitioning check constraint is not valid.

We must wait until ci_pipelines is partitioned and we can create partitions for the next values.

reference list

Proposals

  • Pause partition manager when a new table is being partitioned and re-enable it after partitioning is complete
  • when adding the check constraint include the next X values so that the check constraint is still valid. Switching to the next partition should still work, but how do we determine a good X value?
  • partition new tables by creating the table as partitioned and copy the data over.

There is also the problem with the growth rate between .com and self-managed when selecting the values for the constraint. On .com we know the values, but for self-managed we have no idea and we'll have to take them out of the database which will increase the complexity of the migrations.

Edited by 🤖 GitLab Bot 🤖