Discuss data retention strategy for CI data

Following @drew's comment in gitlab#407821 (comment 1365517761) on our current data retention strategy for Artifacts and Jobs, I thought we could discuss more broadly whether that strategy needs to be revised.

Some of the expected benefits:

  • Lower storage costs, since we would no longer need to store older builds, their artifacts, and other related data (e.g. builds metadata)
  • Improved reliability when performing database operations against large tables

What needs to be considered / evaluated:

  • Pruning large tables could be a challenge and could affect SaaS availability; however, this may be worth exploring now that CI tables are being, and will continue to be, partitioned into smaller tables
  • Legal implications of pruning large tables
  • Who would be impacted (e.g. freemium vs ultimate users)
  • How a revised strategy would be communicated, and how much notice customers would be given
Edited by Cheryl Li