Create design doc on CI Time Decay
Goal
Can GitLab admins define a retention period after which pipeline data is removed?
Benefits
This will allow us to reduce the number of records in the CI database, which improves reliability and makes migrations faster.
Considerations
- The current partitioning strategy for CI tables is based on ID ranges, but it may change in the future. We cannot couple time decay strategies to how we partition tables.
- The Cells initiative is looking into splitting many tables (including the CI database) into locally defined databases. A cell-local CI database will only contain data from organizations in that cell.
- We cannot simply drop old table partitions, because we also need to delete the corresponding artifacts from object storage. Otherwise we will leave many orphaned artifacts.
- The rate at which CI builds are created on SaaS is growing every month. The retention strategy must be performant enough to delete records at least as fast as they are created.
- Removing table partitions can be challenging because the cascading partition_id creates interdependencies between tables.
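
To make these considerations concrete, here is a minimal Python sketch of one possible retention pass. It is not GitLab's implementation: the `Pipeline` class, the `expire_batch` function, the in-memory dict standing in for the CI database, and the set standing in for object storage are all hypothetical. It only illustrates three of the points above: records are selected by age rather than by partition, artifacts are removed before the record that references them, and work is done in bounded batches.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Pipeline:
    # Hypothetical stand-in for a pipeline record and its artifact object keys.
    id: int
    created_at: datetime
    artifact_keys: list[str]


def expire_batch(
    pipelines: dict[int, Pipeline],
    object_store: set[str],
    now: datetime,
    retention: timedelta,
    batch_size: int,
) -> list[int]:
    """Remove up to batch_size pipelines older than the retention period.

    Returns the ids of the pipelines removed in this pass, oldest first.
    """
    cutoff = now - retention
    # Select by created_at, not by partition, so the policy does not depend
    # on how the tables happen to be partitioned.
    expired = sorted(
        (p for p in pipelines.values() if p.created_at < cutoff),
        key=lambda p: p.created_at,
    )[:batch_size]

    removed = []
    for pipeline in expired:
        # Delete artifacts first so that no orphaned objects are left behind
        # if the pass is interrupted before the record is removed.
        for key in pipeline.artifact_keys:
            object_store.discard(key)
        del pipelines[pipeline.id]
        removed.append(pipeline.id)
    return removed
```

A scheduler would call `expire_batch` repeatedly until it returns an empty list. The batch size is the knob that would need tuning so that deletion keeps up with the rate of creation.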
Edited by Fabio Pitino