Track the count of tables of 10 GB+, warn if approaching the 100 GB limit

  • https://handbook.gitlab.com/handbook/engineering/architecture/design-documents/database_size_limits/
  • What matters is that the number of such big tables is tracked.
  • Tables approaching the 100 GB limit should trigger a warning (say, at 50 GB). We will work with the owning teams to reduce table size growth.
  • Tables already exceeding 100 GB will be allowlisted for now.
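
The warning/allowlist behavior above can be sketched as follows. This is a minimal illustration, not the actual implementation; the method name, threshold constants, and `allowlisted:` parameter are assumptions for the sketch.

```ruby
# Hypothetical sketch of the proposed thresholds:
# warn at 50 GB, hard limit at 100 GB, allowlist for pre-existing offenders.
WARNING_THRESHOLD_GB = 50
LIMIT_GB = 100

def table_size_status(size_gb, allowlisted: false)
  return :allowlisted if allowlisted && size_gb > LIMIT_GB
  return :over_limit if size_gb > LIMIT_GB
  return :warning if size_gb >= WARNING_THRESHOLD_GB

  :ok
end
```

A table at 60 GB would report `:warning`, one at 120 GB `:over_limit` (or `:allowlisted` if it predates the limit).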

Action items

  • Update the design document with ranges (10, 50, 100 GB) to classify tables as small, medium, large, and over_limit. See #477398 (comment 2130072487)
  • Add a new `table_size:` field to the table data dictionary.
  • Create a semi-manual process (a script to be called) that updates `table_size:` for all tables from a suitable data source. I started with daily-database-table-size.json, but a more up-to-date data source may be available.
  • Another script is also needed to update Migration/UpdateLargeTable and LargeTables. We can follow this up.
  • Update Migration/PreventIndexCreation for pre-existing migrations
  • Update Migration/AddColumnsToWideTables for pre-existing migrations
  • Prevent new index creation for large and over_limit tables.
  • Prevent new column addition for large and over_limit tables.
  • Track count of tables > 100 GB: 63
  • Track count of tables > 50 GB: 27
  • Add table size monitoring to Tamland similar to int4 monitoring
Edited Nov 29, 2024 by Max Orefice