Add logic to create Ci::Partition automatically

Problem

Once a partition reaches a given size, we should start creating a new one automatically.

To follow our database best practices, we should keep each physical table below 100 GB, so let's start with THRESHOLD = 90 GB.

Solution

  • Add logic to determine when a partition is above the threshold
  • Add a daily cron worker to check whether any partition exceeds the threshold
class Ci::Partition
  state_machine :status, initial: :created do
    state :created    # the tracking record is created 
    state :preparing  # we're creating partitions based on the record's value
    state :ready      # partitions are created for all tables
    state :current    # write destination, limited to 1 record
    state :active     # still accessible for reads and retries, could be used like a default scope to limit the queries
    state :inactive   # data remains in the PG cluster, but not accessible by default
    state :archived   # archived partitions, for when data is moved out of the cluster
    state :error      # in case we fail to create partitions for any of the tables

    event :switch do
      transition :ready => :current
    end

    event :deactivate do
      transition :active => :inactive
    end

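    # Demote the previous write destination so that only one record is :current.
    # Assumes an `active!` helper that moves a record to :active; `&.` covers
    # the first switch, when no :current record exists yet.
    before_transition :ready => :current do
      Ci::Partition.current&.active!
    end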
  end
end
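
The intent of the `switch` event (exactly one `current` record, with the previous one demoted to `active`) can be illustrated without the state machine gem or a database. This is only a minimal sketch: `PartitionRecord` and `PartitionRegistry` are hypothetical stand-ins, not part of the proposal.

```ruby
# Plain-Ruby sketch of the switch semantics; all names here are hypothetical.
PartitionRecord = Struct.new(:id, :status)

class PartitionRegistry
  def initialize(records)
    @records = records
  end

  # The single write destination, or nil if none exists yet.
  def current
    @records.find { |record| record.status == :current }
  end

  # Promote a :ready partition to :current and demote the previous
  # :current partition (if any) to :active.
  def switch!(record)
    raise ArgumentError, "partition #{record.id} is not ready" unless record.status == :ready

    previous = current
    previous.status = :active if previous
    record.status = :current
    record
  end
end

registry_records = [PartitionRecord.new(100, :current), PartitionRecord.new(101, :ready)]
registry = PartitionRegistry.new(registry_records)
registry.switch!(registry_records.last)
```

After the call, partition 101 is the write destination and partition 100 remains readable as `active`.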


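class Ci::Partition
  THRESHOLD = 90.gigabytes # maybe an application setting?

  # Called from a daily cron worker (the method name is only a placeholder).
  def self.switch_if_above_threshold!
    above_threshold = Ci::Partitionable.registered_models.any? do |model|
      model.partitioning_strategy.active_partition.data_size > THRESHOLD
    end

    Ci::Partition.next.switch! if above_threshold
  end
end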
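
To make the threshold comparison concrete, here is a small sketch that runs without Rails. The `Stub*` structs and the table names are hypothetical; they only mimic the `partitioning_strategy.active_partition.data_size` call chain used above, with sizes in bytes.

```ruby
# Hypothetical stand-ins for the real models; sizes are in bytes.
GIGABYTE = 1024**3
THRESHOLD_BYTES = 90 * GIGABYTE # same value as 90.gigabytes in ActiveSupport

StubPartition = Struct.new(:data_size)
StubStrategy = Struct.new(:active_partition)
StubModel = Struct.new(:name, :partitioning_strategy)

def stub_model(name, size_in_gb)
  StubModel.new(name, StubStrategy.new(StubPartition.new(size_in_gb * GIGABYTE)))
end

# True as soon as any model's active partition is strictly larger than the threshold.
def above_threshold?(models, threshold: THRESHOLD_BYTES)
  models.any? { |model| model.partitioning_strategy.active_partition.data_size > threshold }
end

small = [stub_model("builds", 40), stub_model("pipelines", 90)]
large = small + [stub_model("job_artifacts", 91)]
```

With these inputs, `above_threshold?(small)` is false (a partition of exactly 90 GB does not exceed the threshold) and `above_threshold?(large)` is true, so a single oversized table is enough to trigger the switch for all tables.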

module Ci::Partitionable
  def self.registered_models
    Gitlab::Database::Partitioning
      .registered_models
      .select { |model| model < Ci::ApplicationRecord && model < Ci::Partitionable }
  end
end
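
The `select` above relies on `Module#<`, which returns `true` only when the receiver inherits from (or includes) the right-hand side, and `nil` when the two are unrelated. A minimal sketch with hypothetical stub classes, assuming `Ci::Partitionable` is a module that models include:

```ruby
# Hypothetical stand-ins for the real constants.
module PartitionableStub; end      # stands in for Ci::Partitionable
class ApplicationRecordStub; end   # stands in for Ci::ApplicationRecord
class OtherRecordStub; end         # a base class outside the CI database

# CI model that is partitionable: should be selected.
class BuildStub < ApplicationRecordStub
  include PartitionableStub
end

# CI model that is not partitionable: should be rejected.
class RunnerStub < ApplicationRecordStub; end

# Partitionable model outside the CI base class: should be rejected.
class AuditStub < OtherRecordStub
  include PartitionableStub
end

candidates = [BuildStub, RunnerStub, AuditStub]
selected = candidates.select do |model|
  model < ApplicationRecordStub && model < PartitionableStub
end
```

Only `BuildStub` satisfies both conditions, so it is the only element of `selected`.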

See &11815 (comment 1780664103) for more details.

Edited by Max Orefice