Ensure ci_builds_metadata contains only processing data
Overview
ci_builds_metadata currently contains many columns. We should ensure that the data in every one of them is safe to delete after a build is archived.
Work is in progress to clear data from some of the columns (see the discussion in Delete Ci::BuildMetadata after Ci::... (#538031 - closed)), which will help with space savings. However, there is a decent amount of operational efficiency to be gained if we can simply delete the row instead of updating a few columns to null.
Columns
| Column | Type | Nullable? | Mutable? | Removable on archive? #538031 (closed) | Where to move? |
|---|---|---|---|---|---|
| id | integer | no | N/A | | N/A |
| build_id | integer | no | N/A (we may deduplicate immutable data across builds) | | N/A |
| project_id | integer | no | N/A | | N/A |
| partition_id | integer | no | N/A | | N/A |
| timeout | integer | yes | yes (when job picked by runner) | | |
| timeout_source | integer | yes | yes (when job picked by runner) | | |
| interruptible | boolean | no (default | no | | |
| config_options | jsonb | yes | ? | | |
| config_variables | jsonb | yes | no | | |
| has_exposed_artifacts | boolean | yes. We care if it's | no | | |
| environment_auto_stop_in | character varying(255) | | !194402 (closed) being moved to | | |
| expanded_environment_name | character varying(255) | yes | no | | |
| secrets | jsonb | yes | no | | |
| id_tokens | jsonb | yes | no | | |
| debug_trace_enabled | boolean | no (default | yes | | |
| exit_code | smallint | yes | yes | | |
config_options
Top-level keys found, as of 2025-05-23:
[ gprd ] production> Ci::BuildMetadata.select(:config_options).last(300_000).flat_map { |md| md.config_options.keys }.uniq.sort
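The console query above gathers the distinct top-level keys across recent rows. As a minimal sketch of the same flat_map / uniq / sort chain, the snippet below runs it over plain hashes with no database involved. The sample rows and the helper name top_level_keys are made up for illustration, and the sketch additionally tolerates a nil config_options because the column is nullable.

```ruby
# Hypothetical stand-ins for ci_builds_metadata rows; the keys and values
# are illustrative sample data, not taken from production.
SAMPLE_ROWS = [
  { config_options: { "script" => ["rake"], "artifacts" => { "paths" => ["out/"] } } },
  { config_options: { "script" => ["make"], "cache" => [{ "key" => "default" }] } },
  { config_options: nil } # the column is nullable
].freeze

# Collect the distinct top-level keys across all rows, sorted alphabetically.
def top_level_keys(rows)
  rows
    .flat_map { |row| (row[:config_options] || {}).keys }
    .uniq
    .sort
end

p top_level_keys(SAMPLE_ROWS)
```

Only the top-level keys are reported; nested keys such as the ones inside a hash value are not expanded.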
NOTE: Ideally, intrinsic data should be moved to a table that best represents the data. However, due to urgency, we could introduce a column in p_ci_builds that is nullable and not indexed. For example, if artifacts:expose_as is intrinsic (non-processing) data, we could introduce p_ci_builds.artifacts_expose_as as jsonb and move the data there when a pipeline is archived or when new jobs are created.
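As a rough sketch of the idea in the note, the plain-Ruby snippet below pulls the artifacts:expose_as value out of a config_options hash so it could be written to a separate column before the metadata row is deleted. The nested "artifacts" => { "expose_as" => ... } layout and both method names are assumptions for illustration, and p_ci_builds.artifacts_expose_as is the proposed column from the note, not an existing one.

```ruby
# Hypothetical helper: read the intrinsic artifacts:expose_as value from
# config_options. The nested hash layout is an assumption for this sketch.
def extract_artifacts_expose_as(config_options)
  return nil if config_options.nil?

  artifacts = config_options["artifacts"]
  return nil unless artifacts.is_a?(Hash)

  artifacts["expose_as"]
end

# Build the attributes that would be written to the proposed (hypothetical)
# p_ci_builds.artifacts_expose_as column; nil leaves the column NULL.
def archived_build_attributes(config_options)
  { artifacts_expose_as: extract_artifacts_expose_as(config_options) }
end

# Illustrative sample data only.
metadata = {
  "script" => ["make docs"],
  "artifacts" => { "expose_as" => "Docs", "paths" => ["public/"] }
}

p archived_build_attributes(metadata)
p archived_build_attributes(nil)
```

Returning nil when the key is absent matches the note's suggestion that the new column be nullable, so builds that never used the feature store nothing.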
| Top-level key | Nullable? | Mutable? | Removable on archive? | Where to move? |
|---|---|---|---|---|
| | yes | no | | |
| | yes | no | | |
| | yes | no | | @fabiopitino: this should be considered intrinsic data. Consider creating a dedicated table given the low usage of this feature which may help us deprecating it if needed. Alternatively, if stored in |
| | yes | no | | |
| | yes | no | | |
| | yes | no | | |
| | yes | no | | |
| | yes | no | | |
| | yes | no | | |
| | yes | yes | | |
| | yes | yes | | Moving to Redis |
| | yes | | | |
| | yes | no | | |
| | yes | no | | |
| | yes | no | | |
| | yes | | | |
| | yes | | | |
| | yes | no | | |
| | yes | no | | |
| | yes | no | | |
| | yes | no | | |
| | yes | no | | |
| | yes | no | | |
| | yes | no | | |
| | yes | Likely isn't used after the release is created. See #545486 (comment 2547683632). | | |
| | yes | no | | |
| | yes | no | | |
| | yes | no | | |
| | yes | no | | |
| | yes | no | | |
| | yes | no | | |
| | yes | no | | |