Discuss data retention strategy for CI data

Following @drew's comment in gitlab#407821 (comment 1365517761) about our current data retention strategy for Artifacts and Jobs, I thought we could discuss more broadly whether that strategy needs to be revised.

Some of the expected benefits:

Reduced storage costs, since older builds, their artifacts, and other related data (e.g. build metadata) would no longer need to be stored
Improved reliability when performing database operations against large tables

What needs to be considered / evaluated:

Pruning large tables could be a challenge and could affect SaaS availability; however, this may be worth exploring now that CI tables are being, and will continue to be, partitioned into smaller tables
Legal implications of pruning large tables
Who would be impacted (e.g. freemium vs. Ultimate users)
How a revised strategy would be communicated, and how much notice customers would be given
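
To make the partitioning point above more concrete, here is a minimal sketch of how a retention window could translate into whole partitions that are safe to drop, rather than row-by-row deletes against one large table. Everything in it is an assumption for illustration only: the partition names, the `newest_row` field, and the 365-day window are hypothetical and do not reflect how CI partitions are actually keyed or any agreed retention period.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Partition:
    name: str          # hypothetical partition name
    newest_row: date   # date of the most recent row stored in the partition


def droppable_partitions(partitions, today, retention_days):
    """Return partitions whose newest row falls outside the retention window."""
    cutoff = today - timedelta(days=retention_days)
    # A partition is only droppable as a whole if even its newest row is expired.
    return [p for p in partitions if p.newest_row < cutoff]


# Hypothetical partitions, for illustration only.
partitions = [
    Partition("ci_builds_p100", date(2021, 6, 30)),
    Partition("ci_builds_p101", date(2022, 6, 30)),
    Partition("ci_builds_p102", date(2023, 4, 1)),
]

expired = droppable_partitions(partitions, today=date(2023, 4, 24), retention_days=365)
print([p.name for p in expired])  # ['ci_builds_p100']
```

The idea is that a partition containing only expired data could be detached or dropped as a unit, which is typically far cheaper than deleting the same rows in batches. Whether this applies here depends on how the partition boundaries line up with whatever retention rule we choose.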

Edited Apr 24, 2023 by Cheryl Li