Add logic to create Ci::Partition automatically
Problem
Once a partition reaches a given size, we should start creating a new one automatically.
To follow our database best practices, we should aim for a physical table size below 100 GB, so let's start with THRESHOLD=90GB.
Solution
- Add logic to determine when a partition is above the threshold
- Add a daily cron worker to check whether any partition exceeds the threshold
class Ci::Partition
  state_machine :status, initial: :created do
    state :created   # the tracking record is created
    state :preparing # we're creating partitions based on the record's value
    state :ready     # partitions are created for all tables
    state :current   # write destination, limited to 1 record
    state :active    # still accessible for reads and retries, could be used like a default scope to limit the queries
    state :inactive  # data remains in the PG cluster, but not accessible by default
    state :archived  # archived partitions, for when data is moved out of the cluster
    state :error     # in case we fail to create partitions for any of the tables

    event :switch do
      transition :ready => :current
    end

    event :deactivate do
      transition :active => :inactive
    end

    before_transition :ready => :current do
      Ci::Partition.current.active!
    end
  end
end
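To make the intended effect of `switch` concrete, here is a minimal plain-Ruby sketch that does not depend on Rails or the `state_machine` DSL. `PartitionRecord` and `PartitionRegistry` are hypothetical names used only for illustration. The sketch assumes that promoting a `ready` record first demotes the previous `current` record to `active`, so that at most one record is the write destination.

```ruby
# Hypothetical stand-in for a Ci::Partition row: an id plus a status symbol.
PartitionRecord = Struct.new(:id, :status)

# Hypothetical in-memory registry; not part of GitLab.
class PartitionRegistry
  def initialize(records)
    @records = records
  end

  # The single record that currently receives writes, if any.
  def current
    @records.find { |record| record.status == :current }
  end

  # Mirrors `transition :ready => :current` plus the before_transition hook:
  # the old write destination becomes :active before the new one takes over.
  def switch!(record)
    raise ArgumentError, "partition #{record.id} is not ready" unless record.status == :ready

    previous = current
    previous.status = :active if previous
    record.status = :current
    record
  end
end

old_partition = PartitionRecord.new(100, :current)
new_partition = PartitionRecord.new(101, :ready)
registry = PartitionRegistry.new([old_partition, new_partition])

registry.switch!(new_partition)
```

After the call, `old_partition` is `:active` and `new_partition` is the only `:current` record. Guarding against a missing `current` record is a choice made for this sketch; the snippet above calls `active!` unconditionally.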
class Ci::Partition
  THRESHOLD = 90.gigabytes # maybe app settings?

  # From a daily cron worker
  if Ci::Partitionable.registered_models.any? { |model| model.partitioning_strategy.active_partition.data_size > THRESHOLD }
    Ci::Partition.next.switch!
  end
end
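The threshold check the cron worker would run can be sketched in plain Ruby as follows. This is an illustration under assumptions, not the proposed implementation: `ActivePartition`, `threshold_exceeded?`, and the table names are hypothetical, and `THRESHOLD_BYTES` spells out what `90.gigabytes` evaluates to in ActiveSupport (90 × 1024³ bytes).

```ruby
# Assumption: 90 GB expressed in bytes; in Rails this is 90.gigabytes.
THRESHOLD_BYTES = 90 * 1024**3

# Hypothetical stand-in for a partitioned model's active partition.
ActivePartition = Struct.new(:table_name, :data_size)

# Returns true when any active partition is strictly larger than the threshold.
def threshold_exceeded?(active_partitions, threshold: THRESHOLD_BYTES)
  active_partitions.any? { |partition| partition.data_size > threshold }
end

# Hypothetical sizes: one table is over the threshold, one is well under it.
partitions = [
  ActivePartition.new("builds_partition", 95 * 1024**3),
  ActivePartition.new("artifacts_partition", 12 * 1024**3)
]

puts "switch to the next partition" if threshold_exceeded?(partitions)
```

A single oversized table is enough to trigger the switch, which matches the `any?` in the snippet above: all partitioned tables share the same partition value, so they have to move together.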
# Ci::Partitionable is a concern, so it is reopened as a module
module Ci::Partitionable
  def self.registered_models
    Gitlab::Database::Partitioning
      .registered_models
      .select { |model| model < Ci::ApplicationRecord && model < Ci::Partitionable }
  end
end
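The `select` filter relies on `Module#<`, which returns `true` when the receiver inherits from or includes the argument, `false` when the receiver is the argument itself or one of its ancestors, and `nil` when the two are unrelated. A small stand-alone sketch, using hypothetical classes in place of `Ci::ApplicationRecord` and `Ci::Partitionable`:

```ruby
# Hypothetical stand-ins for Ci::ApplicationRecord and the Ci::Partitionable concern.
class BaseRecord; end
module Partitionable; end

# Inherits from the base class and includes the concern: should be selected.
class PartitionedModel < BaseRecord
  include Partitionable
end

# Inherits from the base class only.
class PlainModel < BaseRecord; end

# Includes the concern but lives outside the base class hierarchy.
class UnrelatedModel
  include Partitionable
end

candidates = [PartitionedModel, PlainModel, UnrelatedModel]

# Both comparisons must be truthy; nil (unrelated) filters the model out.
selected = candidates.select { |model| model < BaseRecord && model < Partitionable }
```

Only `PartitionedModel` satisfies both conditions, so `selected` contains that class alone.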
See &11815 (comment 1780664103) for more details.
Edited by Max Orefice