Transition non-processing data off of Ci::BuildMetadata
The Problem
We have some non-processing data stored on p_ci_builds_metadata that needs to be relocated so we can erase all ci_build_metadata records after a Pipeline is archived, regardless of the column values.
Why can't we leave data in-place on the table and just delete what we don't need?
Because of Postgres' operational details, leaving any records on the table will prevent reclamation of the disk space for reuse.
Maybe, but I'm a database engineer so please explain the details of this to me:
For our long-term retention process and self-managed instances:
- The automatic
VACUUM TRUNCATErequires fully empty pages at the end of the table, and the only way to reliably get the benefits of this function is to completely empty a partition. - A manual
TRUNCATErequires us to fully dispense with the whole table anyway. This plan allows the same result to be accomplished by the automaticVACUUM TRUNCATE.
For SaaS in the short term:
- If we are concerned about WAL overload, we can build these new models and populate them without deleting rows from
p_ci_builds_metadata. The combination of prioritized reads from other tables and lack of retries on old jobs/pipelines means that reads and writes on the oldp_ci_builds_metadatapartition will go to actually zero. At this point we can run a manual truncate before we enable the individual-row cleanup strategy relying on the automaticVACUMM TRUNCATE.
But doesn't Postgres reuse disk space?
Not since we partitioned the tables. There is a very low volume of inserts on old partitions, and once archival is enabled it should go to 0. We will not have any meaningful amount of disk reuse, we must use some kind of truncate to get space back.
A Two-Part Proposal
Executive Summary
- Add boolean and integer columns to
p_ci_buildsthat do not have meaningful storage, reading, or writing costs. - Create a new model
Ci::ArchivedBuildMetadatawith a very dense table consisting of boolean and integer columns (very low disk consumption) to enable guaranteed deletion fromp_ci_builds_metadataafter 1 month- This produces a very efficient data structure, with complete continuity of data preservation and service.
- This new model is a temporary data structure to transition old records, and will eventually be dropped.
- This strategy does not require any novel technology or extraordinary effort to be performed by the Database Operations team.
Part 1: Move some columns back onto ci_builds (Address growth rate)
Yes, add to the Builds table. The columns in question are integers and booleans, so the actual disk space consumed by these records will be very small. Also, the data we want to move onto the table is not indexed or queried by. It is only accessed when we've loaded the full record for display, so the additional overhead here is minimal.
Columns
| Column | Type | nullable | Indexed/Queried? | Long-term home |
|---|---|---|---|---|
| timeout | integer | No | p_ci_builds |
|
| timeout_source | integer | not null | No | p_ci_builds |
| exit_code | smallint | No | p_ci_builds |
For now, we will not backfill these columns. But by writing data to these columns for new jobs instead of using the Ci::BuildMetadata record, we will start creating completely deleteable records at the hot/recent end of the p_ci_builds_metadata table. All the new Ci::BuildMetadata records will contain purely processing data (intrinsic data column will be null/blank/false) and so they'll be deleteable by our very regular cleanup workers. This will reduce the database disk growth rate without addressing the very large older partitions.
There is also one column, has_exposed_artifacts, that is indexed and queried by. Since the data referenced by the index is Ci::JobArtifact data, we should attach it to those actual records in p_ci_job_artifacts. When the artifact expires, it's no longer exposed so we should no longer having a column saying it is. This is an operational efficiency we can gain by more appropriately locating the presence flag on the actual data whose presence matters.
debug_trace_enabled is a granular permission-elevation option that refers specifically to what data is exposed in job traces. This data point can be moved onto the trace artifact itself. If for some reason we don't want to do that, it's also possible for this data point to live on ci_build. The transition will be simpler, so this will be a judgement call for the implementing developers.
| Column | Type | nullable | Indexed/Queried? | Long-term home |
|---|---|---|---|---|
| has_exposed_artifacts | boolean | No (we can avoid index based on the access pattern) | p_ci_builds |
|
| debug_trace_enabled | boolean | not null | No |
p_ci_builds because for live traces (using chunks) we don't yet have artifact records. |
Part 2: Create an intermediate Ci::ArchivedBuildMetadata model
Columns
| Column | Type | nullable | Indexed/Queried? | Long-term home |
|---|---|---|---|---|
| timeout | integer | No | p_ci_builds |
|
| timeout_source | integer | not null | No | p_ci_builds |
| has_exposed_artifacts | boolean | Yes | p_ci_builds |
|
| debug_trace_enabled | boolean | not null | No | p_ci_builds |
| exit_code | smallint | No | p_ci_builds |
To be able to effectively clean up the old partitions of p_ci_build_metadata we need to remove every single row, for reasons discussed above in the Problem section. In order to maintain continuity of service, we need to store this data and be able to continuously query and render it.
Fortunately, the columns we need to preserve consume a very small proportion of the data stored in the p_ci_builds_metadata row. In order to delete a whole metadata row, we can copy these five integer and boolean columns to a new table. This table:
- Should be prioritized for reads ahead of
p_ci_builds_metadata. This new table will be dense, indexed, small, and efficient. We'd like to reduce traffic top_ci_build_metadatapartitions as much as possible. - Should be prioritized for reads behind
p_ci_buildsandp_ci_job_artifactswith the moved columns. Since we want the data to live on those tables going forward, we'll be writing to those new columns on existing tables for new records and we should read from that long term SoT when possible. - Should check to see if the data is already persisted on
p_ci_buildsandp_ci_job_artifacts, and if so do not create a new record. This will prevent us from unnecessarily creatingCi::ArchivedBuildMetadatarecords for new jobs that don't need them. - Should not be partitioned. It's not worth the trouble. As we create this table, we know that it's a transitional data model that we don't want to integrate into our permanent CI data life cycle. It's only for dealing with historical data; going forward this data will be written to the existing and partitioned
p_ci_buildsandp_ci_job_artifactstables.
Isn't this just a pg_repack? Why are we bothering to create a new model?
Unfortunately, we render timeout for builds in the API. While this data may be used infrequently, making it part of an API response represents a promise that we can only break by going through Product and UX channels to deprecate the field. This work is ongoing, but the current need to offload CI data is working on a much shorter timeline. We do not have months to wait for clearance and a transition process. We can still drop the data later, but probably not as part of this effort.
Needing to preserve the timeout value means that we could probably only delete about 35% of Ci::BuildsMetadata, as opposed to ~94% if we deleted it. The more data we have to leave in place, the more difficult the repack.
Nullifying the columns on records we need to preserve was also considered, but won't be practically helpful
- If there are lots of rows (65%) that we can nullify the large TOAST columns of (rewrite the tuple), it would be rewriting new tuples to the old partition for 65% of the table and not actually reclaiming old disk space. This would increase the footprint.
- If we mostly delete everything and there are relatively few records (5%) to nullify, it would only be rewriting a few new tuples but also there wouldn't be very much data left anyway so the space savings wouldn't be too significant.